TinkerPop: a story of
graphs, DBs, and graph DBs
Joshua Shinavier and James Thornton
Texas Linux Festival
June 13th, 2014
Once, there was a thing
v(1)
Let’s call it a vertex
The vertex had
some metadata
v(1)
name: “Graph DB workshop”
We’ll call that a property
v(1)
name: “Graph DB workshop”
You are here.
In fact, the vertex had
multiple properties
v(1)
name: “Graph DB workshop”
type: “Event”
The properties were
of various types
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
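The starts and ends values look like timestamps in milliseconds since the Unix epoch. That reading is an assumption, since the slide does not name the unit. Under it, a minimal sketch of decoding them (the helper name from_epoch_millis is hypothetical):

```python
from datetime import datetime, timezone

# Property values copied from the slide; treating the two numbers as
# epoch milliseconds is an assumption, not something the slide states.
workshop = {
    "name": "Graph DB workshop",
    "type": "Event",
    "starts": 1402682400000,
    "ends": 1402696800000,
}


def from_epoch_millis(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


starts = from_epoch_millis(workshop["starts"])
ends = from_epoch_millis(workshop["ends"])
duration_hours = (ends - starts).total_seconds() / 3600

print(starts.isoformat(), ends.isoformat(), duration_hours)
```

Read this way, the workshop runs from 18:00 to 22:00 UTC on June 13th, 2014, a four-hour slot that matches the date on the title slide.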
Our vertex was not alone
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
Thus, an edge
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
The edge was directed…
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
…and labeled
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
The label types the relationship
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
You are here, too.
More vertices joined the fun…
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000
v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000
partOf
partOf
More labels, too
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000
v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000
partOf
partOf
v(7)
name: “TinkerPop suite”
type: “Software”
hasTopic
v(8)
name: “Aurelius Graph Cluster”
type: “Software”
hasTopic
Now it was a labeled multigraph
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000
v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000
partOf
partOf
v(5)
name: “James Thornton”
type: “Person”
githubId: “espeed”
v(6)
name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”
presentedBy
presentedBy
v(7)
name: “TinkerPop suite”
type: “Software”
hasTopic
v(8)
name: “Aurelius Graph Cluster”
type: “Software”
hasTopic
A few more edges
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000
v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000
partOf
partOf
v(5)
name: “James Thornton”
type: “Person”
githubId: “espeed”
v(6)
name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”
presentedBy
presentedBy
v(7)
name: “TinkerPop suite”
type: “Software”
hasTopic
v(8)
name: “Aurelius Graph Cluster”
type: “Software”
hasTopic
contributesTo
contributesTo
contributesTo
Some edges also had properties
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000
partOf
v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000
v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000
partOf
partOf
v(5)
name: “James Thornton”
type: “Person”
githubId: “espeed”
v(6)
name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”
presentedBy
presentedBy
v(7)
name: “TinkerPop suite”
type: “Software”
hasTopic
v(8)
name: “Aurelius Graph Cluster”
type: “Software”
hasTopic
contributesTo
contributesTo
contributesTo
weight: 0.2
weight: 0.8
weight: 1.0
We call this a Property Graph
Many graph DB data models
are variations on this theme
Neo4j
OrientDB
Sparksee*
* the graph database previously known as DEX
etc.
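The Property Graph model built up over the preceding slides can be sketched as a small in-memory data structure. This is a minimal illustration, not the data model of any particular product. The class and method names are hypothetical, and because the slide text does not preserve which vertices each arrow connects, the edge endpoints and the weight on the contributesTo edge are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Vertex:
    vid: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    out_v: int   # id of the vertex the edge leaves (tail)
    label: str   # the label types the relationship
    in_v: int    # id of the vertex the edge enters (head)
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Directed, labeled multigraph with key/value properties on vertices and edges."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: List[Edge] = []  # a list, so parallel edges are allowed

    def add_vertex(self, vid: int, **properties: Any) -> Vertex:
        vertex = Vertex(vid, dict(properties))
        self.vertices[vid] = vertex
        return vertex

    def add_edge(self, out_v: int, label: str, in_v: int, **properties: Any) -> Edge:
        edge = Edge(out_v, label, in_v, dict(properties))
        self.edges.append(edge)
        return edge

    def out_edges(self, vid: int, label: Optional[str] = None) -> List[Edge]:
        """Edges leaving a vertex, optionally restricted to one label."""
        return [e for e in self.edges
                if e.out_v == vid and (label is None or e.label == label)]


g = PropertyGraph()
g.add_vertex(1, name="Graph DB workshop", type="Event",
             starts=1402682400000, ends=1402696800000)
g.add_vertex(2, name="Texas Linux Fest", type="Event",
             starts=1402664400000, ends=1402808400000)
g.add_vertex(6, name="Joshua Shinavier", type="Person", githubId="joshsh")
g.add_vertex(7, name="TinkerPop suite", type="Software")

# Endpoints below are assumed from the story told by the slides.
g.add_edge(1, "partOf", 2)
g.add_edge(1, "presentedBy", 6)
g.add_edge(1, "hasTopic", 7)
g.add_edge(6, "contributesTo", 7, weight=1.0)  # weight pairing is an assumption

print([(e.label, e.in_v) for e in g.out_edges(1)])
```

Storing edges in a plain list keeps the sketch short. A real graph database would index edges by vertex so that out_edges does not scan the whole graph.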
Enter Blueprints
• single Property Graph API supported by diverse
graph database backends
• choose your favorite, but avoid vendor lock-in
• Blueprints : graph DB :: JDBC : RDBMS
• implementations, “ouplementations”, test suites,
and helper utilities are built on top
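The "one API, many backends" idea can be sketched as follows. This is not the Blueprints Java API. The interface, the two toy backends, and every method name here are hypothetical and exist only to show application code written once against an abstract interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class Graph(ABC):
    """Hypothetical minimal graph interface that application code targets."""

    @abstractmethod
    def add_edge(self, out_v: int, label: str, in_v: int) -> None:
        ...

    @abstractmethod
    def out_neighbors(self, v: int, label: str) -> List[int]:
        ...


class AdjacencyGraph(Graph):
    """Toy backend 1: adjacency lists keyed by (vertex, label)."""

    def __init__(self) -> None:
        self._adj: Dict[Tuple[int, str], List[int]] = {}

    def add_edge(self, out_v: int, label: str, in_v: int) -> None:
        self._adj.setdefault((out_v, label), []).append(in_v)

    def out_neighbors(self, v: int, label: str) -> List[int]:
        return list(self._adj.get((v, label), []))


class EdgeListGraph(Graph):
    """Toy backend 2: a flat list of (out, label, in) triples."""

    def __init__(self) -> None:
        self._edges: List[Tuple[int, str, int]] = []

    def add_edge(self, out_v: int, label: str, in_v: int) -> None:
        self._edges.append((out_v, label, in_v))

    def out_neighbors(self, v: int, label: str) -> List[int]:
        return [i for (o, l, i) in self._edges if o == v and l == label]


def load_events(graph: Graph) -> None:
    """Application code: depends only on the Graph interface."""
    for sub_event in (1, 3, 4):
        graph.add_edge(sub_event, "partOf", 2)  # endpoints assumed from the slides


results = []
for backend in (AdjacencyGraph(), EdgeListGraph()):
    load_events(backend)
    results.append(backend.out_neighbors(1, "partOf"))

print(results)
```

Swapping the backend changes nothing in load_events, which is the sense in which a shared API helps avoid vendor lock-in.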
Blueprints implementations
Now we need a
query language…
• build it on the Blueprints API
• query over any Blueprints-compatible DB
• make it path-like, with side-effects
• match abstract traversals through the graph,
filtering, ranking, and mutating as you go
• make it interactive. How about a REPL?
Enter Gremlin
• a domain-specific language for traversing graphs
• Turing-complete, permits access to the full JDK
• has been adapted to various JVM languages
• Gremlin : graph DB :: SQL : RDBMS… sort of
Think “pipes and filters”
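One way to picture "pipes and filters" is a chain of lazy generators, each consuming the output of the previous one. The sketch below is plain Python, not Gremlin or the Pipes library. The step names out, has, and values are chosen for illustration, and the presentedBy endpoints are assumptions because the slide text does not record them.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# A fragment of the workshop graph; edge endpoints are assumed.
VERTICES: Dict[int, Dict[str, Any]] = {
    1: {"name": "Graph DB workshop", "type": "Event"},
    5: {"name": "James Thornton", "type": "Person"},
    6: {"name": "Joshua Shinavier", "type": "Person"},
    7: {"name": "TinkerPop suite", "type": "Software"},
}
EDGES: List[Tuple[int, str, int]] = [
    (1, "presentedBy", 5),
    (1, "presentedBy", 6),
    (1, "hasTopic", 7),
]


def out(vertex_ids: Iterable[int], label: str) -> Iterator[int]:
    """Pipe: for each incoming vertex, emit the heads of its outgoing edges."""
    for v in vertex_ids:
        for (o, l, i) in EDGES:
            if o == v and l == label:
                yield i


def has(vertex_ids: Iterable[int], key: str, value: Any) -> Iterator[int]:
    """Filter: pass through only vertices whose property matches."""
    for v in vertex_ids:
        if VERTICES[v].get(key) == value:
            yield v


def values(vertex_ids: Iterable[int], key: str) -> Iterator[Any]:
    """Pipe: map each vertex to one of its property values."""
    for v in vertex_ids:
        yield VERTICES[v][key]


# Start at v(1), follow presentedBy, keep Person vertices, emit their names.
names = list(values(has(out([1], "presentedBy"), "type", "Person"), "name"))
print(names)
```

Because each stage is a generator, nothing is computed until the final list() pulls results through the chain, which is the dataflow style the slide alludes to.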
The rest of the TinkerPop family
• Pipes: dataflow framework. The basis of Gremlin
• Frames: Java bean framework for graphs
• Furnace: Property Graph algorithms
• Rexster: high-performance graph database server
TinkerPop is…
• a developer group creating an open-source graph DB
stack
• a community of users and third-party implementors
• a foundation for building high-performance graph
applications of any size
• model some data on your laptop
• build massive clustered applications
• open source, BSD licensed
A detailed guide to the
rest of this workshop
• intro to the Aurelius Graph Cluster
• demos of graph tools and concepts
• guided installation of tools
• preview of TinkerPop3
Thanks!
The Aurelius
Graph Cluster
In TinkerPop…
• we adapt various graph DBs to a unified API
• they become Property Graph databases
With AGC…
• we adapt various high-performance databases to
the Titan API
• they become graph databases
Take your pick of CAP
Titan highlights
• graphs, transactions scale with the number of
machines in a cluster
• geo, numeric range, and full text search for vertices
and edges
• support for either of two indexing backends
• ElasticSearch, Lucene
• native support for Blueprints, Rexster
Dealing with supernodes
• Titan’s vertex-centric indices permit ordered querying
from a vertex
• e.g. retrieve “knows” edges… in order of “since”
timestamp
• iterates efficiently, even if there are thousands of edges
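The idea behind a vertex-centric index can be illustrated by keeping each vertex's edges sorted by a chosen property, so that an ordered or range query touches only the relevant slice. This is a conceptual sketch, not Titan's implementation or API. The class name, method names, and sample data are hypothetical.

```python
import bisect
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class VertexCentricIndex:
    """Per-vertex, per-label edge lists kept sorted by a 'since' value."""

    def __init__(self) -> None:
        # (vertex, label) -> sorted list of (since, neighbor)
        self._index: DefaultDict[Tuple[int, str], List[Tuple[int, int]]] = defaultdict(list)

    def add_edge(self, out_v: int, label: str, in_v: int, since: int) -> None:
        # insort keeps the list ordered by 'since' as edges arrive
        bisect.insort(self._index[(out_v, label)], (since, in_v))

    def neighbors_since(self, out_v: int, label: str, start: int, end: int) -> List[int]:
        """Neighbors whose 'since' lies in [start, end), in ascending order."""
        entries = self._index.get((out_v, label), [])
        # A one-element tuple sorts before any (since, neighbor) with the same 'since'
        lo = bisect.bisect_left(entries, (start,))
        hi = bisect.bisect_left(entries, (end,))
        return [neighbor for (_, neighbor) in entries[lo:hi]]


idx = VertexCentricIndex()
idx.add_edge(1, "knows", 10, since=2012)
idx.add_edge(1, "knows", 11, since=2009)
idx.add_edge(1, "knows", 12, since=2014)

recent = idx.neighbors_since(1, "knows", 2010, 2015)
print(recent)
```

The two binary searches locate the slice in logarithmic time, so a supernode with many "knows" edges can be queried without scanning all of them.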
What about Faunus?
Faunus…
• is a Hadoop-based graph analytics engine
• in Titan 0.5 will simply be called Titan/Hadoop
• adds support for global distributed graph
operations
• applies (a subset of) Gremlin in a breadth-first
fashion
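Breadth-first evaluation can be pictured as advancing every traverser in the graph by one step at a time, rather than following one path to its end before starting the next. The sketch below is a single-process illustration of that idea using per-vertex counters. It is not Faunus's Hadoop implementation. The function name step_out is hypothetical and the edge endpoints are assumed.

```python
from collections import Counter
from typing import List, Tuple

# Assumed endpoints: three sessions are each partOf the festival, v(2).
EDGES: List[Tuple[int, str, int]] = [
    (1, "partOf", 2),
    (3, "partOf", 2),
    (4, "partOf", 2),
    (1, "hasTopic", 7),
]


def step_out(frontier: Counter, label: str) -> Counter:
    """Move all traversers one hop along edges with the given label, in one pass.

    frontier maps a vertex id to the number of traversers currently on it.
    """
    next_frontier: Counter = Counter()
    for (out_v, edge_label, in_v) in EDGES:
        if edge_label == label and frontier[out_v] > 0:
            next_frontier[in_v] += frontier[out_v]
    return next_frontier


# One traverser starts on each session vertex.
frontier = Counter({1: 1, 3: 1, 4: 1})
after_part_of = step_out(frontier, "partOf")
print(after_part_of)
```

Each step is a single pass over the edges, which is the kind of operation that maps naturally onto a batch framework.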
Faunus inputs and outputs
• Hadoop SequenceFile format (in/out)
• Titan graph DB (in/out)
• GraphSON format (in/out)
• Rexster (in)
• RDF (in)
• Gremlin scripts (in/out)
Demo time
TinkerPop3
What’s new in TP3
• new Gremlin implementation which makes good use of
Java 8 closures, enables introspection and optimization of
traversals
• new OLAP API with support for message passing systems
like Giraph, Hama, Faunus, etc.
• revamped I/O utilities with support for GraphSON,
GraphML, and GremlinKryo
• new server model, incl. remote execution of scripts via
WebSocket API, server plugin support, customizable
serialization formats
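For readers unfamiliar with message-passing graph computation, the following sketch shows the general bulk-synchronous pattern: in each round every vertex sends a message to its neighbors, then every vertex updates its own state from the messages it received. It is a generic illustration (connected components by minimum-label propagation), not the TinkerPop3 OLAP API. The function name and sample graph are hypothetical.

```python
from typing import Dict, List


def connected_components(neighbors: Dict[int, List[int]]) -> Dict[int, int]:
    """Label each vertex with the smallest vertex id reachable from it.

    neighbors is an undirected adjacency map; every neighbor must also be a key.
    """
    labels = {v: v for v in neighbors}
    changed = True
    while changed:
        # Message phase: each vertex sends its current label to every neighbor.
        inbox: Dict[int, List[int]] = {v: [] for v in neighbors}
        for v, adjacent in neighbors.items():
            for n in adjacent:
                inbox[n].append(labels[v])

        # Compute phase: each vertex keeps the smallest label it has seen.
        changed = False
        for v, messages in inbox.items():
            best = min(messages, default=labels[v])
            if best < labels[v]:
                labels[v] = best
                changed = True
    return labels


graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
components = connected_components(graph)
print(components)
```

The loop ends when a full round passes with no label changes, which plays the role of a termination check between supersteps.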
Gremlitron
• Blueprints, Pipes, and
Gremlin are all integrated
in TinkerPop3
• Frames obsoleted by
Gremlin DSLs
• Furnace is Gremlin OLAP
• Rexster is Gremlin Server
Try it out
• at:
• https://github.com/tinkerpop/tinkerpop3
• mailing list:
• https://groups.google.com/forum/gremlin-users
• we welcome your feedback and/or PRs
josh@fortytwo.net james.thornton@gmail.com
http://tinkerpop.com

TinkerPop: a story of graphs, DBs, and graph DBs

  • 1. TinkerPop: a story of graphs, DBs, and graph DBs Joshua Shinavier and James Thornton Texas Linux Festival June 13th, 2014
  • 2. Once, there was a thing
  • 4. The vertex had some metadata v(1) name: “Graph DB workshop”
  • 5. We’ll call that a property v(1) name: “Graph DB workshop” You are here.
  • 6. In fact, the vertex had multiple properties v(1) name: “Graph DB workshop” type: “Event”
  • 7. The properties were of various types v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000
  • 8. v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 Our vertex was not alone
  • 9. Thus, an edge v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000
  • 10. The edge was directed… v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000
  • 11. …and labeled v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf
  • 12. The label types the relationship v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf You are here, too.
  • 13. v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf More vertices joined the fun…
  • 14. More labels, too v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” hasTopic
  • 15. Now it was a labeled multigraph v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” hasTopic
  • 16. A few more edges v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” contributesTo contributesTo hasTopic contributesTo
  • 17. Some edges also had properties v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” contributesTo contributesTo hasTopic contributesTo weight: 0.2 weight: 0.8 weight: 1.0
  • 18. We call this a Property Graph
  • 19. Many graph DB data models are variations on this theme
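The graph built up over the preceding slides can be sketched in a few lines of Python. This is only an illustrative model, not the TinkerPop API: the class and method names (`Vertex`, `Edge`, `PropertyGraph`, `out`) are made up for this sketch. The direction of the `partOf` edge, and the `contributesTo` edge with its weight, are assumptions, since the transcript does not preserve which endpoints or which weight belong to which edge.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Vertex:
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    out_v: int  # id of the vertex the edge leaves (tail)
    label: str  # the label types the relationship
    in_v: int   # id of the vertex the edge enters (head)
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Directed, labeled multigraph with key/value properties on vertices and edges."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: List[Edge] = []

    def add_vertex(self, vid: int, **props: Any) -> Vertex:
        vertex = Vertex(vid, dict(props))
        self.vertices[vid] = vertex
        return vertex

    def add_edge(self, out_v: int, label: str, in_v: int, **props: Any) -> Edge:
        edge = Edge(out_v, label, in_v, dict(props))
        self.edges.append(edge)  # a list, so parallel edges are allowed
        return edge

    def out(self, vid: int, label: str) -> List[Vertex]:
        """Vertices reached by following outgoing edges with the given label."""
        return [self.vertices[e.in_v] for e in self.edges
                if e.out_v == vid and e.label == label]


g = PropertyGraph()
g.add_vertex(1, name="Graph DB workshop", type="Event",
             starts=1402682400000, ends=1402696800000)
g.add_vertex(2, name="Texas Linux Fest", type="Event",
             starts=1402664400000, ends=1402808400000)
g.add_vertex(6, name="Joshua Shinavier", type="Person", githubId="joshsh")
g.add_vertex(7, name="TinkerPop suite", type="Software")

# Assumed direction: the workshop is part of the festival.
g.add_edge(1, "partOf", 2)
# Assumed endpoints and weight, for illustration only.
contributes = g.add_edge(6, "contributesTo", 7, weight=1.0)

parent_events = [v.properties["name"] for v in g.out(1, "partOf")]
```

Storing edges in a plain list keeps the sketch short; a real graph database would index edges by vertex so that `out` does not scan every edge.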
  • 20. Neo4j
  • 22. Sparksee* * the graph database previously known as DEX
  • 23. etc.
  • 24. Enter Blueprints • single Property Graph API supported by diverse graph database backends • choose your favorite, but avoid vendor lock-in • Blueprints : graph DB :: JDBC : RDBMS • implementations, “ouplementations”, test suites, and helper utilities are built on top
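The "one API, many backends" idea can be illustrated with a small sketch. Blueprints itself is a Java API; the Python interface below (`GraphAPI`) and the two toy backends are hypothetical and exist only to show how application code written against the shared interface runs unchanged on either store.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class GraphAPI(ABC):
    """Hypothetical common interface that every backend must implement."""

    @abstractmethod
    def add_vertex(self, vid: int, **props: Any) -> None: ...

    @abstractmethod
    def add_edge(self, out_v: int, label: str, in_v: int) -> None: ...

    @abstractmethod
    def out_neighbors(self, vid: int, label: str) -> List[int]: ...


class AdjacencyBackend(GraphAPI):
    """Toy backend that stores outgoing edges per vertex."""

    def __init__(self) -> None:
        self._props: Dict[int, Dict[str, Any]] = {}
        self._out: Dict[int, List[Tuple[str, int]]] = {}

    def add_vertex(self, vid: int, **props: Any) -> None:
        self._props[vid] = dict(props)
        self._out.setdefault(vid, [])

    def add_edge(self, out_v: int, label: str, in_v: int) -> None:
        self._out.setdefault(out_v, []).append((label, in_v))

    def out_neighbors(self, vid: int, label: str) -> List[int]:
        return [v for (l, v) in self._out.get(vid, []) if l == label]


class EdgeListBackend(GraphAPI):
    """Toy backend that stores all edges in one flat list."""

    def __init__(self) -> None:
        self._props: Dict[int, Dict[str, Any]] = {}
        self._edges: List[Tuple[int, str, int]] = []

    def add_vertex(self, vid: int, **props: Any) -> None:
        self._props[vid] = dict(props)

    def add_edge(self, out_v: int, label: str, in_v: int) -> None:
        self._edges.append((out_v, label, in_v))

    def out_neighbors(self, vid: int, label: str) -> List[int]:
        return [i for (o, l, i) in self._edges if o == vid and l == label]


def load_events(graph: GraphAPI) -> None:
    """Application code: depends only on GraphAPI, never on a backend."""
    graph.add_vertex(1, name="Graph DB workshop")
    graph.add_vertex(2, name="Texas Linux Fest")
    graph.add_vertex(3, name="Chef Workshop")
    graph.add_edge(1, "partOf", 2)
    graph.add_edge(3, "partOf", 2)


results = []
for backend in (AdjacencyBackend(), EdgeListBackend()):
    load_events(backend)
    results.append(backend.out_neighbors(1, "partOf"))
```

Swapping the backend changes only the constructor call, which is the sense in which a shared API helps avoid vendor lock-in.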
  • 26. Now we need a query language… • build it on the Blueprints API • query over any Blueprints-compatible DB • make it path-like, with side-effects • match abstract traversals through the graph, filtering, ranking, and mutating as you go • make it interactive. How about a REPL?
  • 27. Enter Gremlin • a domain-specific language for traversing graphs • Turing-complete, permits access to the full JDK • has been adapted to various JVM languages • Gremlin : graph DB :: SQL : RDBMS… sort of
  • 28. Think “pipes and filters”
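A rough way to picture "pipes and filters" is a chain of lazy stages, each consuming the stream produced by the previous one. The sketch below uses Python generators; it is not Gremlin syntax or the Pipes API, and the helper names (`out`, `in_`, `has`, `values`, `pipeline`) are invented for illustration. The edge direction (sessions point to the festival via `partOf`) is an assumption.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

VERTICES: Dict[int, Dict[str, Any]] = {
    1: {"name": "Graph DB workshop", "type": "Event"},
    2: {"name": "Texas Linux Fest", "type": "Event"},
    3: {"name": "Chef Workshop", "type": "Event"},
    4: {"name": "Canonical Charm School", "type": "Event"},
}
# (out vertex, label, in vertex); direction is assumed for this sketch.
EDGES: List[Tuple[int, str, int]] = [
    (1, "partOf", 2),
    (3, "partOf", 2),
    (4, "partOf", 2),
]

Pipe = Callable[[Iterable[Any]], Iterator[Any]]


def out(label: str) -> Pipe:
    """Pipe: follow outgoing edges with the given label."""
    def pipe(ids: Iterable[int]) -> Iterator[int]:
        for vid in ids:
            for (o, l, i) in EDGES:
                if o == vid and l == label:
                    yield i
    return pipe


def in_(label: str) -> Pipe:
    """Pipe: follow incoming edges with the given label."""
    def pipe(ids: Iterable[int]) -> Iterator[int]:
        for vid in ids:
            for (o, l, i) in EDGES:
                if i == vid and l == label:
                    yield o
    return pipe


def has(key: str, value: Any) -> Pipe:
    """Filter: keep only vertices whose property matches."""
    def pipe(ids: Iterable[int]) -> Iterator[int]:
        return (vid for vid in ids if VERTICES[vid].get(key) == value)
    return pipe


def values(key: str) -> Pipe:
    """Pipe: map each vertex id to one of its property values."""
    def pipe(ids: Iterable[int]) -> Iterator[Any]:
        return (VERTICES[vid][key] for vid in ids)
    return pipe


def pipeline(start: Iterable[Any], *pipes: Pipe) -> Iterator[Any]:
    """Chain the pipes; nothing is evaluated until the result is consumed."""
    stream: Iterator[Any] = iter(start)
    for p in pipes:
        stream = p(stream)
    return stream


# "Which events share a parent event with vertex 1?"
sessions = list(pipeline([1], out("partOf"), in_("partOf"),
                         has("type", "Event"), values("name")))
```

Because each stage is a generator, elements flow through the chain one at a time, which is the depth-first, lazy style that a path-like query language suggests.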
  • 29. The rest of the TinkerPop family • Pipes: dataflow framework. The basis of Gremlin • Frames: Java bean framework for graphs • Furnace: Property Graph algorithms • Rexster: high-performance graph database server
  • 30. TinkerPop is… • a developer group creating an open-source graph DB stack • a community of users and third-party implementors • a foundation for building high-performance graph applications of any size: model some data on your laptop, or build massive clustered applications • open source, BSD licensed
  • 31. A detailed guide to the rest of this workshop • intro to the Aurelius Graph Cluster • demos of graph tools and concepts • guided installation of tools • preview of TinkerPop3
  • 34. In TinkerPop… • we adapt various graph DBs to a unified API • they become Property Graph databases
  • 35. With AGC… • we adapt various high-performance databases to the Titan API • they become graph databases
  • 36. Take your pick of CAP
  • 37. Titan highlights • graphs, transactions scale with the number of machines in a cluster • geo, numeric range, and full text search for vertices and edges • support for either of two indexing backends: ElasticSearch or Lucene • native support for Blueprints, Rexster
  • 38. Dealing with supernodes • Titan’s vertex-centric indices permit ordered querying from a vertex • e.g. retrieve “knows” edges… in order of “since” timestamp • iterates efficiently, even if there are thousands of edges
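The idea behind a vertex-centric index can be sketched as a per-vertex, per-label list of edges kept sorted by a chosen edge property, so that ordered and range queries do not have to scan every edge of a supernode. This is a simplified in-memory illustration, not Titan's implementation; the class name `VertexCentricIndex` and its methods are hypothetical, and the vertex ids and years are made up.

```python
import bisect
from typing import Dict, List, Tuple


class VertexCentricIndex:
    """Per-vertex edge lists kept sorted by a sort key (here: 'since')."""

    def __init__(self) -> None:
        # (out vertex id, edge label) -> sorted list of (since, in vertex id)
        self._index: Dict[Tuple[int, str], List[Tuple[int, int]]] = {}

    def add_edge(self, out_v: int, label: str, in_v: int, since: int) -> None:
        entries = self._index.setdefault((out_v, label), [])
        bisect.insort(entries, (since, in_v))  # insert while preserving order

    def edges_in_order(self, out_v: int, label: str) -> List[Tuple[int, int]]:
        """All edges of one label from one vertex, ordered by 'since'."""
        return list(self._index.get((out_v, label), []))

    def edges_since(self, out_v: int, label: str, start: int) -> List[Tuple[int, int]]:
        """Edges with since >= start, located by binary search."""
        entries = self._index.get((out_v, label), [])
        lo = bisect.bisect_left(entries, (start,))
        return entries[lo:]


idx = VertexCentricIndex()
idx.add_edge(1, "knows", 20, since=2012)
idx.add_edge(1, "knows", 10, since=2009)
idx.add_edge(1, "knows", 30, since=2014)

ordered = idx.edges_in_order(1, "knows")
recent = idx.edges_since(1, "knows", 2012)
```

The binary search finds the start of the range in logarithmic time, so the cost of a range query depends on the number of matching edges rather than on the vertex's total degree.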
  • 40. Faunus… • is a Hadoop-based graph analytics engine • in Titan 0.5 will simply be called Titan/Hadoop • adds support for global distributed graph operations • applies (a subset of) Gremlin in a breadth-first fashion
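Breadth-first evaluation can be pictured as advancing every traverser in the graph by one step at a time, rather than following one path to its end before starting the next. The sketch below is a single-process approximation of that idea and says nothing about how Faunus maps it onto Hadoop jobs; the function `step_out` and the edge data are illustrative assumptions.

```python
from collections import Counter
from typing import List, Tuple

# (out vertex, label, in vertex); direction is assumed for this sketch.
EDGES: List[Tuple[int, str, int]] = [
    (1, "partOf", 2),
    (3, "partOf", 2),
    (4, "partOf", 2),
]


def step_out(frontier: Counter, edges: List[Tuple[int, str, int]], label: str) -> Counter:
    """Advance all traversers one hop along outgoing edges with the given label.

    The frontier maps vertex id -> number of traversers currently there.
    One pass over the edge list moves every traverser at once.
    """
    next_frontier: Counter = Counter()
    for (out_v, edge_label, in_v) in edges:
        if edge_label == label and out_v in frontier:
            next_frontier[in_v] += frontier[out_v]
    return next_frontier


# Start one traverser at every vertex, then take one global step.
start = Counter({1: 1, 2: 1, 3: 1, 4: 1})
after_one_step = step_out(start, EDGES, "partOf")
after_two_steps = step_out(after_one_step, EDGES, "partOf")
```

Keeping only a count per vertex, instead of a full path per traverser, is what makes this style practical for global operations over very large graphs.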
  • 41. Faunus inputs and outputs • Hadoop SequenceFile format (in/out) • Titan graph DB (in/out) • GraphSON format (in/out) • Rexster (in) • RDF (in) • Gremlin scripts (in/out)
  • 44. What’s new in TP3 • new Gremlin implementation which makes good use of Java 8 closures, enables introspection and optimization of traversals • new OLAP API with support for message passing systems like Giraph, Hama, Faunus, etc. • revamped I/O utilities with support for GraphSON, GraphML, and GremlinKryo • new server model, incl. remote execution of scripts via WebSocket API, server plugin support, customizable serialization formats
  • 45. Gremlitron • Blueprints, Pipes, and Gremlin are all integrated in TinkerPop3 • Frames obsoleted by Gremlin DSLs • Furnace is Gremlin OLAP • Rexster is Gremlin Server
  • 46. Try it out • at: https://github.com/tinkerpop/tinkerpop3 • mailing list: https://groups.google.com/forum/gremlin-users • we welcome your feedback and/or PRs