8. Retrieving vertices
// Get a traverser so that we can run some queries
gremlin> g = graph.traversal(standard())
gremlin> g.V()
==>v[0]
==>v[2]
==>v[4]
==>v[6]
// Get the properties for each vertex
gremlin> g.V().valueMap()
==>[name:[Acme Trucks]]
==>[firstName:[Susan]]
==>[firstName:[Tom]]
==>[license:[ABC123], year:[2012]]
9. Basic vertex filtering
// Retrieve all people with firstName Susan
gremlin> g.V().hasLabel("person").has("firstName", "Susan")
==>v[2]
// Retrieve all people with firstName Susan or Tom
gremlin> g.V().hasLabel("person").has("firstName", within("Susan", "Tom"))
==>v[2]
==>v[4]
10. Querying adjacent edges and vertices
// Count how many people Acme Trucks employs
gremlin> g.V().hasLabel("company").has("name", "Acme Trucks").out("employs").count()
==>2
// How many employees were hired in 2012?
gremlin> g.V().hasLabel("person").where(inE("employs").has("hired", 2012)).count()
==>1
// Which employees drive a truck?
gremlin> g.V().hasLabel("company").has("name", "Acme Trucks").out("employs").as("driver").out("drives").select("driver").values("firstName")
==>Tom
// Show me all of the drivers that were hired before 2015
gremlin> g.V().hasLabel("person").and(inE("employs").values("hired").is(lt(2015)), out("drives")).values("firstName")
==>Tom
11. Many more steps...
● AddEdge Step
● AddVertex Step
● AddProperty Step
● Aggregate Step
● And Step
● As Step
● By Step
● Cap Step
● Coalesce Step
● Count Step
● Choose Step
● Coin Step
● CyclicPath Step
● Dedup Step
● Drop Step
● Fold Step
● Group Step
● GroupCount Step
● Has Step
● Inject Step
● Is Step
● Limit Step
● Local Step
● Match Step
● ...
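A few of the steps listed above, sketched against the sample graph from the earlier slides. This assumes the `g` traversal source created on slide 8 and a TinkerPop 3 Gremlin console. Output formatting differs between versions, so none is shown.

```groovy
// Assumes `g` is the traversal source created on slide 8.

// GroupCount step: number of vertices per label
g.V().groupCount().by(T.label)

// Dedup step: distinct employers, even when reached through several employees
g.V().hasLabel("person").in("employs").dedup().values("name")

// Limit step: return at most one person
g.V().hasLabel("person").limit(1)

// Fold step: collect all first names into a single list
g.V().hasLabel("person").values("firstName").fold()
```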
12. GraphComputer for global graph processing
● Use cases
○ full graph traversal
○ parallel processing
○ batch import/export
● Examples
○ PageRank
○ vertex count
○ mass schema update
● Gremlin OLAP implementations
○ Hadoop
○ Spark
○ Giraph
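As a rough sketch of two of the examples above (vertex count and PageRank), the following assumes the `graph` instance from the earlier slides, a graph implementation that supports GraphComputer, and the TinkerPop 3.0-style console API. Later TinkerPop releases changed some of these signatures, so treat it as illustrative rather than version-exact.

```groovy
// Assumes `graph` is the graph instance from the earlier slides.

// Vertex count as an OLAP job: the same Gremlin, run through the computer engine
olap = graph.traversal(computer())
olap.V().count()

// PageRank using the vertex program bundled with TinkerPop
// (3.0-style builder; newer releases expect create(graph) instead of create())
result = graph.compute().program(PageRankVertexProgram.build().create()).submit().get()

// The computed rank is stored as a property on each vertex of the result graph
result.graph().traversal().V().valueMap()
```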
13. Graph use cases
● Social network analysis
● Fraud detection
● Recommendation systems
● Route optimization
● IoT
● Master data management
15. TitanDB
● What is Titan?
● Data store options
● Deployment options
● Titan Cassandra data model
● Titan specific graph features
16. TitanDB
● Graph layer that can use a variety of data stores as backends depending
on user requirements
○ HBase
○ Berkeley DB
○ Cassandra
○ Insert your favorite k/v, BigTable data store
17. Which data store is right for you?
● Things to think about
○ data volume
○ CAP
○ ACID
○ read/write requirements
○ ops implications
○ your current infrastructure
http://s3.thinkaurelius.com/docs/titan/0.5.4/benefits.html
19. A Titan cluster with access options
[Diagram: five-node cluster, each node running Titan alongside Cassandra (C*)]
● Access options
○ Titan < 0.9
■ Rexster
■ dependency of your app
○ Titan 0.9+
■ Gremlin server
■ dependency of your app
○ Object to graph mapper
■ Python - Mogwai, Bulbs
■ JVM - Totorom, Frames
● Titan does not need to be on each node; all communication between Titan instances goes through C*
20. Titan installation
● Download and unzip latest milestone
● Cassandra footprint
○ Titan keyspace
○ Column families
■ edgestore
■ edgestore_lock_
■ graphindex
■ graphindex_lock_
■ titan_ids
■ ...
./bin/titan.sh start
Forking Cassandra...
Running `nodetool statusthrift`.. OK (returned exit
status 0 and printed string "running").
Forking Elasticsearch...
Connecting to Elasticsearch (127.0.0.1:9300). OK
(connected to 127.0.0.1:9300).
Forking Gremlin-Server...
Connecting to Gremlin-Server (127.0.0.1:8182)...... OK
(connected to 127.0.0.1:8182).
Run gremlin.sh to connect.
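Once the console is up, opening the Cassandra-backed graph might look like the sketch below. The properties file name is an assumption based on the files typically shipped in the Titan distribution's conf directory, and the snippet needs the Cassandra and Elasticsearch processes started above.

```groovy
// Assumption: conf/titan-cassandra-es.properties exists in the unzipped distribution
graph = TitanFactory.open('conf/titan-cassandra-es.properties')

// Same traversal source as on slide 8, now backed by Cassandra
g = graph.traversal(standard())
g.V().count()
```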
21. Vertex and edge storage format
[Diagram labels: Cassandra, Thrift, Titan storage format]
23. Schema definition
● Properties
○ data type - string, float, char, geoshape, etc.
○ cardinality - single, list, set
○ uniqueness (through Titan’s indexing system)
● Edges
○ labels
○ define multiplicity - one-to-one, many-to-one, one-to-many
● Vertices
○ labels
● Advanced
○ edge, vertex, and property TTL
○ Meta-properties - properties on properties (audit info for example)
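The schema options above can be sketched with Titan's management API. This assumes a Titan 0.9+/1.0-style console. The in-memory backend is used only to keep the example self-contained, the `nickname` key is hypothetical, and the multiplicities are illustrative choices rather than anything stated in the deck.

```groovy
// In-memory backend so the sketch runs without Cassandra
graph = TitanFactory.build().set('storage.backend', 'inmemory').open()
mgmt = graph.openManagement()

// Properties: data type and cardinality
mgmt.makePropertyKey('firstName').dataType(String.class).cardinality(Cardinality.SINGLE).make()
mgmt.makePropertyKey('hired').dataType(Integer.class).cardinality(Cardinality.SINGLE).make()
// Hypothetical multi-valued key, stored as a set
mgmt.makePropertyKey('nickname').dataType(String.class).cardinality(Cardinality.SET).make()

// Vertex labels
mgmt.makeVertexLabel('company').make()
mgmt.makeVertexLabel('person').make()

// Edge labels with multiplicity
// ONE2MANY: a person has at most one incoming employs edge (illustrative assumption)
mgmt.makeEdgeLabel('employs').multiplicity(Multiplicity.ONE2MANY).make()
mgmt.makeEdgeLabel('drives').multiplicity(Multiplicity.MULTI).make()

mgmt.commit()
```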
24. Global indexing options
● Supports composite keys
● Titan indexing provider
○ fast!
○ exact matches only
● External providers
○ Not as fast
○ Many options beyond exact matching (wildcards, geosearch, etc.)
○ providers
■ Elasticsearch
■ Lucene
■ Solr
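A minimal sketch of both kinds of global index, again assuming a Titan 0.9+/1.0-style console and an in-memory backend. The index names are made up for illustration. The mixed index is left commented out because it only works when an external index backend (here assumed to be configured under the name `search`) is available.

```groovy
graph = TitanFactory.build().set('storage.backend', 'inmemory').open()
mgmt = graph.openManagement()
firstName = mgmt.makePropertyKey('firstName').dataType(String.class).make()
license = mgmt.makePropertyKey('license').dataType(String.class).make()

// Composite index: exact-match lookups served by Titan itself
mgmt.buildIndex('personByFirstName', Vertex.class).addKey(firstName).buildCompositeIndex()

// Composite index that also enforces uniqueness of the key
mgmt.buildIndex('truckByLicense', Vertex.class).addKey(license).unique().buildCompositeIndex()

// Mixed index: delegated to an external provider such as Elasticsearch.
// Requires an index backend configured as 'search', so it is not run here.
// mgmt.buildIndex('personSearch', Vertex.class).addKey(firstName, Mapping.TEXT.asParameter()).buildMixedIndex('search')

mgmt.commit()
```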
25. Vertex Centric Indices
● Adjacent edge counts can grow quite large in certain situations and form super nodes
● Supports composite keys and ordering of edges to speed up vertex centric queries
○ translates into slice queries of the edges
○ efficiently retrieve ranges of edges or satisfy top n type queries
[Diagram: company vertex (name: Acme Trucks) with three outgoing employs edges, hired: 2013, 2014, and 2015]
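For the employs edges pictured here, a vertex-centric index on `hired` could be declared roughly as follows. This assumes a Titan 0.9+/1.0-style console with an in-memory backend, and the index name `employsByHired` is made up for illustration. The graph in this sketch is empty, so the traversals only show the query shapes such an index is meant to serve.

```groovy
graph = TitanFactory.build().set('storage.backend', 'inmemory').open()
mgmt = graph.openManagement()
hired = mgmt.makePropertyKey('hired').dataType(Integer.class).make()
employs = mgmt.makeEdgeLabel('employs').make()

// Keep each vertex's outgoing employs edges sorted by hired, newest first
mgmt.buildEdgeIndex(employs, 'employsByHired', Direction.OUT, Order.decr, hired)
mgmt.commit()

g = graph.traversal(standard())

// Range query over one vertex's edges: employees hired after 2013
g.V().has('name', 'Acme Trucks').outE('employs').has('hired', gt(2013)).inV()

// Top-n query: the ten most recent hires
g.V().has('name', 'Acme Trucks').outE('employs').order().by('hired', decr).limit(10).inV()
```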
30. A bit more about WellAware
● Founded in 2012
● Full stack oil & gas monitoring solution
● iOS, Android, and web clients
● Connecting to field assets over RPMA, cellular, and
satellite
31. Functionality and high level architecture
● Remote data collection
● Mobile data collection
● Asset control
● Derived measurements
● Alarming
● Reporting
[Architecture diagram: Poller, Django, Titan, WAN, ESB]
32. Moving to Titan
● 2013
○ Running Django against PostgreSQL and, for a while, TempoDB
● Beginning of 2014 - started using Titan 0.4.4 to capture relationships
between assets and for derived measurements
● March 2014 - deployed a 3 node Cassandra cluster and moved the rest of
the backend (minus auth) over to Titan 0.4.4
● Today - 3 node DC for OLTP & 2 node reporting DC
○ still on Titan 0.4.4, waiting for Titan 1.0 to be released and hardened
○ post Titan 1.0, we’re looking forward to trying out DSE Graph
33. A common well pad configuration
Well & pumpjack
Tanks
35. Zooming in on a well pad
[Diagram: well, meter, separator, meter, tank, tank, compressor]
36. Lessons learned
● No native integration with 3rd party BI tools - reports, dashboards, ad hoc
query
○ Apache Calcite-based JDBC driver that translates SQL to graph queries
● Colocating Titan, some of your application code, and Cassandra on the same nodes: what's the right separation?
● Out of the box framework support is lacking (no native Spring, Dropwizard
support)
● Performance tuning requires knowledge of Titan AND Cassandra
● Play to Cassandra and adjacency list storage format strengths
● You can’t hide from tombstones!!!
37. Graph and Titan resources
● TinkerPop docs - http://www.tinkerpop.com/docs/3.0.0.M6/
● Titan docs - http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/
● Titan Google group - https://groups.google.com/forum/#!forum/aureliusgraphs
● Gremlin Google group - https://groups.google.com/forum/#!forum/gremlin-users
● O’Reilly graph ebook (focuses on Neo4j but has generally applicable graph
info) - http://graphdatabases.com/
● Java OGM - https://github.com/BrynCooke/totorom
● Python OGM - https://mogwai.readthedocs.org/en/latest/