Gremlin's Anatomy

Gremlin is the graph traversal language of Apache TinkerPop, an open source graph computing framework that is implemented by many graph databases, including DSE Graph. Even a novice Gremlin user will recognize the statement "g.V()", but in this presentation we pause to examine the elements of that ubiquitous statement and of the steps appended to it. With that foundational knowledge of "Gremlin's Anatomy" in hand, we perform an autopsy on an advanced Gremlin traversal and, in doing so, expose techniques for examining and taming the most complex and confusing Gremlin one might come across.

Published in: Technology

Gremlin's Anatomy

  1. Gremlin's Anatomy. Stephen Mallette (@spmallette). © 2018. All Rights Reserved.
  2. What is a graph?
  3. ● A property graph is a collection of vertices and edges
        ○ A vertex represents an entity or domain object
        ○ An edge represents a directional relationship between two vertices
        ○ Vertices and edges have labels and properties
      Example diagram: vertices (label: person, name: Stephen), (label: company, name: DataStax), and (label: person, name: Paras); edges (label: employs, since: 2015), (label: employs, since: 2016), and (label: knows).
  4. How does Apache TinkerPop support graph processing?
  6. Gremlin's Anatomy refers to the study of the Gremlin graph traversal language at a level beyond mere reference to the set of steps that define the language.
  7. ● Reasons to study Gremlin's Anatomy
        ○ Write better Gremlin
        ○ Improve ability to read more complex Gremlin
        ○ Ask more concise questions if you get stuck
        ○ Develop more robust and expressive Domain Specific Languages
  8. g.V()
      GraphTraversalSource (the g)
      ● Spawns GraphTraversal instances with start steps
      ● Holds configurations that Gremlin will use in the traversal
  9. g.V().
      has('movie', 'title', 'Young Guns').
      outE().
      groupCount()
      GraphTraversal
      ● The steps that make up the Gremlin language
      ● Each step returns a GraphTraversal so that steps can be chained together
      ● The output of one step becomes the input to the next
  10. g.V().
      has('movie', 'title', 'Young Guns').
      outE().
      groupCount().
      by()
      Step Modulators
      ● A step modulator looks back at the previous step and modifies its behavior
  11. g.V().
      has('movie', 'title', 'Young Guns').
      outE().
      groupCount().
      by(label())
      Anonymous Traversal
      ● A traversal not bound to a GraphTraversalSource
      ● Spawned from the double underscore class (e.g., __.label())
      ● Usually exposed as standalone functions for better readability
  12. g.V().
      has('movie', 'title', within('Young Guns', 'Hot Shots!')).
      outE().
      groupCount().
      by(label())
      Expressions
      ● Typically refers to string tokens, enums, or Predicate values (i.e., anything that makes Gremlin more readable)
      ● Be aware of naming collisions with steps and other expressions (e.g., P.not() and the not() step)
  13. g.V().
      has('movie', 'title', within('Young Guns', 'Hot Shots!')).
      outE().
      groupCount().
      by(label()).next()
      Terminal Steps
      ● Steps that do not return a traversal and instead return the traversal result
      ● Commonly used terminal steps include hasNext(), next(), iterate(), and toList()
      ● If you don't have a terminal step, you don't have a result. You just have a GraphTraversal.
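
The last point on the Terminal Steps slide can be illustrated with a loose analogy in plain Python (this is not TinkerPop code): a generator pipeline, like a GraphTraversal, does no work until something consumes it. The names `visited`, `vertices`, and `traversal` are hypothetical and exist only for this sketch.

```python
# Analogy only: a Python generator stands in for a lazy GraphTraversal.
visited = []  # records which items the pipeline has actually processed


def vertices():
    """Hypothetical stand-in for g.V(): yields vertex names lazily."""
    for name in ["alice", "bobby", "cindy"]:
        visited.append(name)
        yield name


# Building the pipeline performs no work, just as a traversal without a
# terminal step produces no result.
traversal = (name.upper() for name in vertices())
work_before = list(visited)

# Consuming the pipeline plays the role of a terminal step.
first = next(traversal)   # comparable in spirit to next()
rest = list(traversal)    # comparable in spirit to toList()

print(work_before)  # []
print(first, rest)  # ALICE ['BOBBY', 'CINDY']
```

Nothing is appended to `visited` until `next()` is called, which is the behavior the slide warns about: without a terminal step there is only a traversal object, not a result.
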
  14. The Gremlin Debugging Cycle presents methods and techniques that provide insight into Gremlin statements of any complexity for purposes of learning, debugging, and crafting traversals.
  15. https://stackoverflow.com/a/47680050/1831717
      g.V().has('name','alice').as('v').
      repeat(outE().as('e').inV().as('v')).
      until(has('name','alice')).
      store('a').
      by('name').
      store('a').
      by(select(all, 'v').unfold().values('name').fold()).
      store('a').
      by(select(all, 'e').unfold().
      store('x').
      by(union(values('value'), select('x').count(local)).fold()).
      cap('x').
      store('a').
      by(unfold().limit(local, 1).fold()).
      unfold().
      sack(assign).
      by(constant(1d)).
      sack(div).
      by(union(constant(1d), tail(local, 1)).sum()).
      sack(mult).
      by(limit(local, 1)).
      sack().sum()).
      cap('a')
  16. Sample Graph
      Vertices: name: alice, name: bobby, name: cindy, name: david, name: eliza
      Edges (label: rates):
      ● alice to bobby (tag: ruby, value: 9)
      ● bobby to cindy (tag: ruby, value: 8)
      ● cindy to david (tag: ruby, value: 7)
      ● david to eliza (tag: ruby, value: 6)
      ● eliza to alice (tag: java, value: 10)
  17.
               \,,,/
               (o o)
      -----oOOo-(3)-oOOo-----
      plugin activated: tinkerpop.server
      plugin activated: tinkerpop.utilities
      plugin activated: tinkerpop.tinkergraph
      gremlin>
  18. gremlin> graph = TinkerGraph.open()
      ==>tinkergraph[vertices:0 edges:0]
      gremlin> g = graph.traversal()
      ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
      gremlin> g.addV().property('name','alice').as('a').
      ......1> addV().property('name','bobby').as('b').
      ......2> addV().property('name','cindy').as('c').
      ......3> addV().property('name','david').as('d').
      ......4> addV().property('name','eliza').as('e').
      ......5> addE('rates').from('a').to('b').property('tag','ruby').property('value',9).
      ......6> addE('rates').from('b').to('c').property('tag','ruby').property('value',8).
      ......7> addE('rates').from('c').to('d').property('tag','ruby').property('value',7).
      ......8> addE('rates').from('d').to('e').property('tag','ruby').property('value',6).
      ......9> addE('rates').from('e').to('a').property('tag','java').property('value',10).
      .....10> iterate()
      gremlin> graph
      ==>tinkergraph[vertices:5 edges:5]
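
For readers without a Gremlin Console at hand, the same sample graph can be modeled with plain Python data structures. This is a minimal sketch that only mirrors the data created in the console session above; the `Vertex` and `Edge` classes are hypothetical and are not part of TinkerPop.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: str  # name of the vertex the edge leaves
    in_v: str   # name of the vertex the edge enters
    properties: dict = field(default_factory=dict)


names = ["alice", "bobby", "cindy", "david", "eliza"]
vertices = {name: Vertex({"name": name}) for name in names}

edges = [
    Edge("rates", "alice", "bobby", {"tag": "ruby", "value": 9}),
    Edge("rates", "bobby", "cindy", {"tag": "ruby", "value": 8}),
    Edge("rates", "cindy", "david", {"tag": "ruby", "value": 7}),
    Edge("rates", "david", "eliza", {"tag": "ruby", "value": 6}),
    Edge("rates", "eliza", "alice", {"tag": "java", "value": 10}),
]

print(len(vertices), len(edges))  # 5 5
```

The edges form a single directed cycle that starts and ends at alice, which is what the `repeat()...until()` portion of the traversal relies on.
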
  19. gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> store('a').
      ......8> by(select(all, 'e').unfold().
      ......9> store('x').
      .....10> by(union(values('value'),
      .....11> select('x').count(local)).fold()).
      .....12> cap('x').
      .....13> store('a').
      .....14> by(unfold().limit(local, 1).fold()).
      .....15> unfold().
      .....16> sack(assign).
      .....17> by(constant(1d)).
      .....18> sack(div).
      .....19> by(union(constant(1d),
      .....20> tail(local, 1)).sum()).
      .....21> sack(mult).
      .....22> by(limit(local, 1)).
      .....23> sack().sum()).
      .....24> cap('a')
      ==>[alice,[alice,bobby,cindy,david,eliza,alice],[9,8,7,6,10],18.833333333333332]
  20. Gremlin Debugging Cycle (diagram): a cycle of three stages, (1) Isolate for Analysis, (2) Clarify Results, (3) Validate Assumptions.
  21. Isolate for Analysis (1)
      Isolate a portion of the larger Gremlin statement for analysis. The isolated portion should be only as long as you are capable of following by just reading the code.
  22. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice'))
      ==>v[0]
  23. Clarify Results (2)
      Be clear on the result being returned from the current portion of Gremlin under review, as it will feed into the following steps that will later be analyzed.
  24. Clarify Results (2)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> valueMap()
      ==>[name:[alice]]
  25. Clarify Results (2)
      If using DataStax Studio, consider converting results containing vertices to incident edges, using inE(), outE() or bothE(), which will allow for a graph visualization of the result.
  26. Validate Assumptions (3)
      Validate any assumptions regarding the nature of the traversal. Consider the traversal path, side-effects, the graph schema, or anything that might be hidden by a naive view of the result.
  27. Validate Assumptions (3)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> path().next()
      ==>v[0]
      ==>e[10][0-rates->2]
      ==>v[2]
      ==>e[11][2-rates->4]
      ==>v[4]
      ==>e[12][4-rates->6]
      ==>v[6]
      ==>e[13][6-rates->8]
      ==>v[8]
      ==>e[14][8-rates->0]
      ==>v[0]
  28. Validate Assumptions (3)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> select('v')
      ==>[v[0],v[2],v[4],v[6],v[8],v[0]]
      Step labels, created with the as()-step, are a type of side-effect. Be clear about what they contain, and use the select()-step to extract their contents.
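
What `path()` and `select('v')` reveal can also be reproduced by hand. The sketch below walks a plain-Python copy of the sample graph from alice along outgoing edges until alice is reached again, recording the vertices and edge values seen along the way. The `out_edges` dictionary and `walk_cycle` function are hypothetical helpers for illustration, not a TinkerPop API.

```python
# Each vertex maps to its single outgoing 'rates' edge: (target, value).
out_edges = {
    "alice": ("bobby", 9),
    "bobby": ("cindy", 8),
    "cindy": ("david", 7),
    "david": ("eliza", 6),
    "eliza": ("alice", 10),
}


def walk_cycle(start):
    """Follow outgoing edges from start until start is reached again."""
    v_label = [start]  # plays the role of the 'v' step label
    e_label = []       # plays the role of the 'e' step label
    current = start
    while True:
        target, value = out_edges[current]
        e_label.append(value)
        v_label.append(target)
        current = target
        if current == start:  # mirrors until(has('name','alice'))
            return v_label, e_label


visited, ratings = walk_cycle("alice")
print(visited)  # ['alice', 'bobby', 'cindy', 'david', 'eliza', 'alice']
print(ratings)  # [9, 8, 7, 6, 10]
```

Because the exit check runs after each hop, the walk takes at least one step before testing for alice, which is why alice appears at both ends of the list.
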
  29. Isolate for Analysis (1)
      Clarify Results (2)
      ● Use values() or valueMap() for vertex/edge results
      ● Visualize results
      Validate Assumptions (3)
      ● Use path() to inspect what has been traversed
      ● Inspect step labels with select()
  30. Isolate for Analysis (1)
      Slowly expand the portion of Gremlin being analyzed to the next point of isolation and repeat the debugging cycle.
  31. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name')
      ==>v[0]
  32. Isolate for Analysis (1)
      Clarify Results (2)
      ● Use values() or valueMap() for vertex/edge results
      ● Visualize results
      Validate Assumptions (3)
      ● Use path() to inspect what has been traversed
      ● Inspect step labels with select()
      ● Inspect contents of side-effect steps with cap()
  33. Examine Side-effects (3)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> cap('a')
      ==>[alice]
      As with step labels, verify the contents of side-effects. In this case, determine what is being stored in the "a" side-effect by using the cap()-step.
  34. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold())
      ==>v[0]
      Slowly expand the portion of Gremlin being analyzed to the next point of isolation and repeat the debugging cycle.
  35. Examine Side-effects (3)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> cap('a')
      ==>[alice,[alice,bobby,cindy,david,eliza,alice]]
      As with step labels, verify the contents of side-effects. In this case, determine what is being stored in the "a" side-effect by using the cap()-step.
  36. Isolate for Analysis (1)
      ● Large or complex anonymous traversals may require their own isolated analysis
      Clarify Results (2)
      ● Use values() or valueMap() for vertex/edge results
      ● Visualize results
      Validate Assumptions (3)
      ● Use path() to inspect what has been traversed
      ● Inspect step labels with select()
      ● Inspect contents of side-effect steps with cap()
  37. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v')).
      ......7> cap('a')
      ==>[alice,[v[0],v[2],v[4],v[6],v[8],v[0]]]
      To gain clarity on anonymous traversals, isolate them for review as part of the debugging cycle.
  38. Isolate for Analysis (1)
      ● Large or complex anonymous traversals may require their own isolated analysis
      Clarify Results (2)
      ● Use values() or valueMap() for vertex/edge results
      ● Visualize results
      ● Consider temporarily promoting an anonymous traversal to its parent if it yields better clarity into the result
      Validate Assumptions (3)
      ● Use path() to inspect what has been traversed
      ● Inspect step labels with select()
      ● Inspect contents of side-effect steps with cap()
  39. Clarify Results (2)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> select(all, 'v')
      ==>[v[0],v[2],v[4],v[6],v[8],v[0]]
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> select(all, 'v').
      ......4> unfold().
      ......5> values('name')
      ==>alice
      ==>bobby
      ==>cindy
      ==>david
      ==>eliza
      ==>alice
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> select(all, 'v').
      ......4> unfold().
      ......5> values('name').
      ......6> fold()
      ==>[alice,bobby,cindy,david,eliza,alice]
      For analysis, consider temporarily promoting the anonymous traversal to the parent traversal, as it might allow better focus on the actual result. In this case, the store() side-effects are not relevant to the specific analysis of the contents of the by() modulator.
  40. gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> store('a').
      ......8> by(select(all, 'e').unfold().
      ......9> store('x').
      .....10> by(union(values('value'),
      .....11> select('x').count(local)).fold()).
      .....12> cap('x').
      .....13> store('a').
      .....14> by(unfold().limit(local, 1).fold()).
      .....15> unfold().
      .....16> sack(assign).
      .....17> by(constant(1d)).
      .....18> sack(div).
      .....19> by(union(constant(1d),
      .....20> tail(local, 1)).sum()).
      .....21> sack(mult).
      .....22> by(limit(local, 1)).
      .....23> sack().sum()).
      .....24> cap('a')
      ==>[alice,[alice,bobby,cindy,david,eliza,alice],[9,8,7,6,10],18.833333333333332]
  41. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> store('a').
      ......8> by(select(all, 'e').unfold().
      ......9> store('x').
      .....10> by(union(values('value'),
      .....11> select('x').count(local)).fold()).
      .....12> cap('x').
      .....13> store('a').
      .....14> by(unfold().limit(local, 1).fold())).
      .....15> cap('a').next()
      ==>alice
      ==>[alice,bobby,cindy,david,eliza,alice]
      ==>[9,8,7,6,10]
      ==>[[9,0],[8,1],[7,2],[6,3],[10,4]]
      It may be necessary to isolate anonymous traversals if they are especially complex.
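
The pairs in the last result line come from `store('x')`: each edge value is stored together with the size of `x` at that moment, which acts as a zero-based position. A minimal sketch of that bookkeeping in plain Python, assuming the edge values arrive in path order (the list names are hypothetical):

```python
values = [9, 8, 7, 6, 10]  # edge 'value' properties in path order

x = []  # plays the role of the 'x' side-effect
for value in values:
    # Mirrors union(values('value'), select('x').count(local)).fold():
    # pair the value with the number of entries stored so far.
    x.append([value, len(x)])

print(x)  # [[9, 0], [8, 1], [7, 2], [6, 3], [10, 4]]
```

Reading the second element of each pair as an index makes the later `tail(local, 1)` step easier to follow, since it picks out exactly that element.
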
  42. Isolate for Analysis (1)
      ● Large or complex anonymous traversals may require their own isolated analysis
      ● Isolation within an anonymous traversal can produce misleading intermediate results; including the end portion of a traversal may be helpful
      Clarify Results (2)
      ● Use values() or valueMap() for vertex/edge results
      ● Visualize results
      ● Consider temporarily promoting an anonymous traversal to its parent if it yields better clarity into the result
      Validate Assumptions (3)
      ● Use path() to inspect what has been traversed
      ● Inspect step labels with select()
      ● Inspect contents of side-effect steps with cap()
  43. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> store('a').
      ......8> by(select(all, 'e').unfold().
      ......9> store('x').
      .....10> by(union(values('value'),
      .....11> select('x').count(local)).fold()).
      .....12> cap('x').
      .....13> store('a').
      .....14> by(unfold().limit(local, 1).fold()).
      .....15> unfold()).
      .....16> cap('a').next()
      ==>alice
      ==>[alice,bobby,cindy,david,eliza,alice]
      ==>[9,8,7,6,10]
      ==>[9,0]
      Analyzing portions of a traversal can produce misleading intermediate results that might leave an incorrect impression as to the overall intention of the traversal when executed as a whole.
  44. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> store('a').
      ......8> by(select(all, 'e').unfold().
      ......9> store('x').
      .....10> by(union(values('value'),
      .....11> select('x').count(local)).fold()).
      .....12> cap('x').
      .....13> store('a').
      .....14> by(unfold().limit(local, 1).fold()).
      .....15> unfold().
      .....16> sack(assign).
      .....17> by(constant(1d))).
      .....18> cap('a').next()
      ==>alice
      ==>[alice,bobby,cindy,david,eliza,alice]
      ==>[9,8,7,6,10]
      ==>[9,0]
      Analyzing portions of a traversal can produce misleading intermediate results that might leave an incorrect impression as to the overall intention of the traversal when executed as a whole.
  45. Isolate for Analysis (1)
      gremlin> g.V().has('name','alice').as('v').
      ......1> repeat(outE().as('e').inV().as('v')).
      ......2> until(has('name','alice')).
      ......3> store('a').
      ......4> by('name').
      ......5> store('a').
      ......6> by(select(all, 'v').unfold().values('name').fold()).
      ......7> store('a').
      ......8> by(select(all, 'e').unfold().
      ......9> store('x').
      .....10> by(union(values('value'),
      .....11> select('x').count(local)).fold()).
      .....12> cap('x').
      .....13> store('a').
      .....14> by(unfold().limit(local, 1).fold()).
      .....15> unfold().
      .....16> sack(assign).
      .....17> by(constant(1d)).
      .....18> sack().sum()).
      .....19> cap('a').next()
      ==>alice
      ==>[alice,bobby,cindy,david,eliza,alice]
      ==>[9,8,7,6,10]
      ==>5.0
      Include context from the end of a portion of a traversal to help ensure the intent of the overall traversal is consistent.
  46. Isolate for Analysis (1)
      ● Large or complex anonymous traversals may require their own isolated analysis
      ● Isolation within an anonymous traversal can produce misleading intermediate results; including the end portion of a traversal may be helpful
      Clarify Results (2)
      ● Use values() or valueMap() for vertex/edge results
      ● Visualize results
      ● Consider temporarily promoting an anonymous traversal to its parent if it yields better clarity into the result
      ● fold() can be a good temporary replacement for other reducing steps like sum(), max(), min(), etc., as it provides visibility into the values of that computation
      Validate Assumptions (3)
      ● Use path() to inspect what has been traversed
      ● Inspect step labels with select()
      ● Inspect contents of side-effect steps with cap()
  47. 47. © 2018. All Rights Reserved. gremlin> g.V().has('name','alice').as('v'). ......1> repeat(outE().as('e').inV().as('v')). ......2> until(has('name','alice')). ......3> store('a'). ......4> by('name'). ......5> store('a'). ......6> by(select(all, 'v').unfold().values('name').fold()). ......7> store('a'). ......8> by(select(all, 'e').unfold(). ......9> store('x'). .....10> by(union(values('value'), .....11> select('x').count(local)).fold()). .....12> cap('x'). .....13> store('a'). .....14> by(unfold().limit(local, 1).fold()). .....15> unfold(). .....16> sack(assign). .....17> by(constant(1d)). .....18> sack().fold()). .....19> cap('a').next() ==>alice ==>[alice,bobby,cindy,david,eliza,alice] ==>[9,8,7,6,10] ==>[1.0,1.0,1.0,1.0,1.0] Do not allow the actual steps in the traversal to prevent visibility into the nature of the computations that Gremlin is performing. g.V().has('name','alice').as('v'). repeat(outE().as('e').inV().as('v')). until(has('name','alice')). store('a'). by('name'). store('a'). by(select(all, 'v').unfold().values('name').fold()). store('a'). by(select(all, 'e').unfold(). store('x'). by(union(values('value'), select('x').count(local)).fold()). cap('x'). store('a'). by(unfold().limit(local, 1).fold()). unfold(). sack(assign). by(constant(1d)). sack(div). by(union(constant(1d), tail(local, 1)).sum()). sack(mult). by(limit(local, 1)). sack().sum()). cap('a') Clarify Results 2
  48. 48. © 2018. All Rights Reserved. Isolate for Analysis 1 Validate Assumptions 3 ● Use values() or valueMap() for vertex/edge results ● Visualize results ● Consider temporarily promoting an anonymous traversal to its parent if it yields better clarity into the result ● fold() can be a good temporary replacement for other reducing steps like sum(), max(), min(), etc. as it provides visibility to the values of that computation ● Isolate results of interest ● Use path() to inspect what has been traversed ● Inspect step labels with select() ● Inspect contents of side- effects steps with cap() Clarify Results 2 ● Large or complex anonymous traversals may require their own isolated analysis ● Isolation within an anonymous traversal can produce misleading intermediate results - including the end portion of a traversal may be helpful
  49. 49. © 2018. All Rights Reserved. gremlin> g.V().has('name','alice').as('v'). ......1> repeat(outE().as('e').inV().as('v')). ......2> until(has('name','alice')). ......3> store('a'). ......4> by('name'). ......5> store('a'). ......6> by(select(all, 'v').unfold().values('name').fold()). ......7> store('a'). ......8> by(select(all, 'e').unfold(). ......9> store('x'). .....10> by(union(values('value'), .....11> select('x').count(local)).fold()). .....12> cap('x'). .....13> store('a'). .....14> by(unfold().limit(local, 1).fold()). .....15> unfold(). .....16> sack(assign). .....17> by(constant(1d)). .....18> sack(div). .....19> by(union(constant(1d), .....20> tail(local, 1)).sum()). .....21> sack().fold()). .....22> cap('a').unfold().skip(3).fold().next() ==>[1.0,0.5,0.3333333333333333,0.25,0.2] Limit output as necessary to focus on the aspects of the results that are most under consideration. g.V().has('name','alice').as('v'). repeat(outE().as('e').inV().as('v')). until(has('name','alice')). store('a'). by('name'). store('a'). by(select(all, 'v').unfold().values('name').fold()). store('a'). by(select(all, 'e').unfold(). store('x'). by(union(values('value'), select('x').count(local)).fold()). cap('x'). store('a'). by(unfold().limit(local, 1).fold()). unfold(). sack(assign). by(constant(1d)). sack(div). by(union(constant(1d), tail(local, 1)).sum()). sack(mult). by(limit(local, 1)). sack().sum()). cap('a') Clarify Results 2
  50. 50. © 2018. All Rights Reserved. Isolate for Analysis 1 Validate Assumptions 3 ● Use values() or valueMap() for vertex/edge results ● Visualize results ● Consider temporarily promoting an anonymous traversal to its parent if it yields better clarity into the result ● fold() can be a good temporary replacement for other reducing steps like sum(), max(), min(), etc. as it provides visibility to the values of that computation ● Isolate results of interest ● Use path() to inspect what has been traversed ● Inspect step labels with select() ● Inspect contents of side- effects steps with cap() Clarify Results 2 ● Large or complex anonymous traversals may require their own isolated analysis ● Isolation within an anonymous traversal can produce misleading intermediate results - including the end portion of a traversal may be helpful ● Simulate a traversal from a specific known input with the inject() start step
  51. 51. Isolate for Analysis (step 1)
      Lost? Reclaim focus by simulating just the portion of the traversal that is not yet understood.

      gremlin> g.inject([9,0],[8,1],[7,2],[6,3],[10,4]).
                 sack(assign).
                   by(constant(1d)).
                 sack(div).
                   by(union(constant(1d),tail(local,1)).sum()).
                 sack()
      ==>1.0
      ==>0.5
      ==>0.3333333333333333
      ==>0.25
      ==>0.2

      gremlin> g.inject([10,4]).
                 union(constant(1d),tail(local,1))
      ==>1.0
      ==>4

      gremlin> g.inject([10,4]).
                 union(constant(1d),tail(local,1)).
                 sum()
      ==>5.0

      gremlin> 1.0d / g.inject([10,4]).
                 union(constant(1d),tail(local,1)).
                 sum().next()
      ==>0.2
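As a cross-check outside Gremlin, the arithmetic in the simulated sack() steps on slide 51 can be reproduced in plain Python. This is only a sketch under stated assumptions: each injected pair is read as [value, index], sack(assign) is taken to set the sack to 1.0, and sack(div) is taken to divide it by 1 + index. The names `pairs` and `weight` are hypothetical and do not come from the slides.

```python
# Plain-Python cross-check of the sack() arithmetic simulated with inject().
# Assumption: each pair is [value, index], as in g.inject([9,0],[8,1],...).
pairs = [[9, 0], [8, 1], [7, 2], [6, 3], [10, 4]]


def weight(pair):
    """Mirror sack(assign) followed by sack(div) for one injected pair."""
    sack = 1.0                 # sack(assign).by(constant(1d))
    divisor = 1.0 + pair[-1]   # union(constant(1d), tail(local, 1)).sum()
    return sack / divisor      # sack(div)


weights = [weight(p) for p in pairs]
print(weights)
```

The printed list should line up with the five values the console returns for the first simulated traversal, which is a quick way to confirm that the divisor really is "one plus the last element of the pair".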
  52. 52. gremlin> g.V().has('name','alice').as('v').
                 repeat(outE().as('e').inV().as('v')).
                   until(has('name','alice')).
                 store('a').
                   by('name').
                 store('a').
                   by(select(all, 'v').unfold().values('name').fold()).
                 store('a').
                   by(select(all, 'e').unfold().
                      store('x').
                        by(union(values('value'),
                                 select('x').count(local)).fold()).
                      cap('x').
                      store('a').
                        by(unfold().limit(local, 1).fold()).
                      unfold().
                      sack(assign).
                        by(constant(1d)).
                      sack(div).
                        by(union(constant(1d),
                                 tail(local, 1)).sum()).
                      sack(mult).
                        by(limit(local, 1)).
                      sack().sum()).
                 cap('a')
      ==>[alice,[alice,bobby,cindy,david,eliza,alice],[9,8,7,6,10],18.833333333333332]
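The last element of the result on slide 52 can be sanity-checked by hand. The sketch below, in plain Python rather than Gremlin, assumes the edge values are the third element of the result ([9,8,7,6,10]), that store('x') indexes them from 0, and that the sack steps compute value / (1 + index) before summing. The variable names are hypothetical.

```python
# Plain-Python cross-check of the final number in the traversal result.
# Assumption: edge 'value' properties in traversal order, indexed from 0.
values = [9, 8, 7, 6, 10]

total = 0.0
for index, value in enumerate(values):
    sack = 1.0 / (1.0 + index)  # sack(assign) then sack(div)
    sack *= value               # sack(mult).by(limit(local, 1))
    total += sack               # sack().sum()

print(total)  # roughly 18.83
```

Under those assumptions the sum is 9/1 + 8/2 + 7/3 + 6/4 + 10/5, which agrees with the value the traversal reports up to floating-point rounding.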
  53. 53. Traversal patterns identified as a result of the Gremlin Debugging Cycle (callouts on the traversal and result from slide 52)
      ● Methods of cycle-detection
      ● Step label access and collection manipulation
      ● Result construction with side-effects
      ● Overriding traversers with by()-modulators
      ● Creating indexed values
      ● Using sack() for computations
  54. 54. Recap of the Gremlin Debugging Cycle: Isolate for Analysis (step 1), Clarify Results (step 2), Validate Assumptions (step 3). The bullets repeat slide 50.
  55. 55. What's next?
  56. 56. Resources
      ● Apache TinkerPop Website
        ○ http://tinkerpop.apache.org/
      ● Gremlin Recipes
        ○ http://tinkerpop.apache.org/docs/current/recipes/
      ● The Gremlin Compendium
        ○ http://www.doanduyhai.com/blog/?p=13460
      ● Practical Gremlin
        ○ http://kelvinlawrence.net/book/Gremlin-Graph-Guide.html
      ● Gremlin DSLs - Part I
        ○ https://www.datastax.com/dev/blog/gremlin-dsls-in-java-with-dse-graph
      ● Gremlin DSLs - Part II
        ○ https://academy.datastax.com/content/gremlin-dsls-python-dse-graph
