A gremlin in my graph, Confoo 2014


Neo4j offers richly connected data and a whiteboard-friendly paradigm. It also brings a gremlin into your code: Gremlin, one of the supported graph query languages, offers a refreshing look at how one can search for data in a vast and interconnected web of data. Gremlin provides an abstraction layer that makes it easy to express your business logic without fighting with the code. It may even change your mind about object-oriented programming.

Published in: Technology

  1. A gremlin in your graph. Montreal, Québec, Canada, February 28th, 2014
  2. What is Gremlin?
      G: graph, or the dataset
      V: vertices, or nodes or objects
      E: edges, or links or relations
  3. Graph database: http://www.neo4j.org/
  4. Speaker: Damien Seguy, dams@php.net. Exakat: expert services in PHP
  5. Yes, we take questions
  6. Graph Database: http://www.neo4j.org/, then http://localhost:7474/, Console -> Gremlin. A console in the web browser!
      ==>
      ==>          \,,,/
      ==>          (o o)
      ==> -----oOOo-(_)-oOOo-----
      ==>
      ==> Available variables:
      ==> g = (neo4jgraph[EmbeddedGraphDatabase [data/graph.db]], null)
      ==> out = (java.io.PrintStream@398a3257, null)
      gremlin>
  7. Welcome to the graph
      g is the graph where the nodes live
      V represents all the vertices
      g.v(1) => v(1)
      Nodes always have an id
  8. Properties: the graph is schemaless
      g.v(1).id => 1
      g.v(1).name => ext/datetime
      g.v(1).version => null
  9. Node discovery
      g.v(2).map => {name=timezonedb, version=2013.9}
      map is convenient for discovering the graph
  10. In the graph: only objects and relations
      Vertices have an id and properties
      Edges have an id, a label and properties
  11. Edges: a directed graph
      g.v(1).in => v(4)
      g.v(1).out => v(2) v(3)
      g.v(1).both => v(2) v(3) v(4)
  12. Edge ids and labels
      g.v(1).inE.id => 2 3 4 8
      g.v(1).inE.label => WROTE WROTE WROTE HAS
  13. PECL database: a database of PHP extension authors. The extensions are stored in categories. http://pecl.php.net/ Gephi
  14. Following edges
      g.v(1).in('WROTE').name => Derick Rethans, Hannes Magnusson, Jeremy Mikola
      g.v(1).in('WROTE','HAS').name => /* same as previous, plus */ DB
  15. Collaborators
      g.v(2).out('WROTE').in('WROTE').name => Hannes Magnusson, Jeremy Mikola, Derick Rethans
  16. Collaborators
      g.v(2).out('WROTE').in('WROTE').except(g.v(2)).name => Hannes Magnusson, Jeremy Mikola
  17. Intro recap
      Vertices and edges: basic blocks
      in and out (and both): navigation
      except(), in('label'): filtering
  18. Traversing the graph
      Traversing means reading information in the graph
      Traversing involves listing nodes, then following links until all conditions are met
      The graph contains vertices and edges. Is there anything else?
  19. PECL database: Authors, Ext, Categories
  20. Count authors
      count()
      All vertices are created equal
  21. Count contributors
      g.V.out('wrote').count() => 5
      Too many!
  22. Arrays or pipes
      g.V.out('wrote')[1] => v(12)
      g.V.out('wrote')[1..2].name => ext/xdebug ext/mongo
  23. Count contributors
      g.V.in('wrote').unique().count() ==> 3
  24. Gremlin functions
      Pipe-level functions: in, out, unique, count
      Node-level functions: map, has, filter{}
      Value level: {property}
  25. Property level
      // making name UPPERCASE
      g.v(79).name.toUpperCase(); => EXT/GEARMAN
      // size of the name's string
      g.v(130).name.toList().size(); => 13
      // extracting words in a string
      g.v(146).transform{ it.name.tokenize(); } => [Johann-Peter, Hartmann]
      Documentation: http://groovy.codehaus.o
  26. Vertex level
      g.v(130).map; => {name=Ben Ramsey}
      g.v(14).propertyKeys => name
      g.v(12).setProperty('ext_nb', g.v(12).out('wrote').count());
  27. Collaborating: adding collaborators to the graph
      g.addEdge( g.v(1), g.v(1).out('WROTE').in('WROTE').except(g.v(1)), 'COLLABORATE');
      ==> No signature of method: com.tinkerpop.gremlin.groovy.GremlinGroovyPipeline.except() is applicable for argument types: (com.tinkerpop.blueprints.pgm.impls.neo4j.Neo4jVertex) values: [v[1]]
      ==> Possible solutions: except(java.util.Collection), select(), next(), reset(), cap(), toSet()
      except() produces a pipe! g.addEdge doesn't accept it
  28. Collaborating
      Adding one collaborator:
      g.addEdge( g.v(1), g.v(1).out('WROTE').in('WROTE').except(g.v(1)).next(), 'COLLABORATE');
      Adding all collaborators:
      g.v(1).out('WROTE').in('WROTE').except(g.v(1)).each{ g.addEdge(g.v(1), it, 'COLLABORATE') }
      The wonderful world of closures
  29. Working with pipes
      Pipe functions often accept a closure
      A closure is written between {} and uses 'it' as the current node
      Closures often have a standard default behavior, so they are sometimes stealthy
  30. Filtering
      Filter links by label: in('label'), out('label'), both('label')
      Filter nodes with has('property', 'value') or hasNot('property', null)
      Filter authors with more than 14 extensions:
      g.V.in('WROTE').filter{it.out('WROTE').count() > 14}
      => Ilia Alshanetsky, Wez Furlong, Sara Golemon
      Filter allows us to work within the pipe
  31. Filtering
      List of contributors with extensions in more than two categories:
      g.V.in('wrote').filter{ it.out('wrote').in('has').unique().count() > 2 }.name
      Filter longer than the query?
  32. GroupCount: groupCount(m)
      Number of categories by author:
      g.V.out('wrote').in('has').groupCount(m)
      ==> v[380]=1
      ==> v[379]=1
      ==> v[378]=2
      ==> v[273]=2
      ==> v[274]=2
      ==> v[272]=2
      ==> v[173]=3
      ==> v[240]=2
      Apparently counted, but who is v(274)?
  33. GroupCount
      g.V.out('has').in('wrote').groupCount(m){it.name}
      ==> Georg Richter=2
      ==> Warren Read=2
      ==> Jay Kint=2
      ==> shekhar euvel=1
      ==> Grant Croker=1
  34. GroupCount
      Count of categories, without the PHP standard distribution
      The second closure counts elements (default: +1)
      g.V.has('wrote').in('has').groupCount(m){it.name}{ if (!(it.name in ['mysql', 'timezonedb', 'gd', 'dbase' /* ....*/])) { it.b + 1; } }
  35. Pipes
      Array notation
      Closure usage
      Useful functions for pipes: groupCount, groupBy{key}{value}{mapreduce}, orderMap, the [n..m] operator
      More on http://gremlindocs.com/
  36. Graph modifications
      Gremlin allows graph modification
      Adding a type property to authors:
      g.V.in('WROTE').each{ it.setProperty('type', 'author'); }
  37. Updating on the way
      sideEffect runs a closure, but keeps running the query
      Adding a type to extensions AND categories in the same query:
      g.V.as('ext').in('HAS').sideEffect{ it.setProperty('type', 'Category'); }.back('ext').sideEffect{ it.setProperty('type', 'Extension'); }
  38. Backtracking
      back('name'): goes back to the vertex or edge that was named with as('name')
      back(n): goes back n vertices or edges
      Makes it possible to check a branch, come back and check another branch
  39. Manipulating vertices
      g.addVertex(id or null, [property:value,...]);
      g.addEdge(origin vertex, destination vertex, label, [property:value...]);
      g.removeVertex(vertex);
      g.removeEdge(edge);
  40. Application to OOP?
      Gremlin goes beyond class specifics:
      g.v(1).out('WROTE').in('HAS').unique().count()
      The same count in PHP:
      $total = array();
      $author = new Author(1);
      foreach ($author->getExtensions() as $ext) { $total[$ext->getCategory()] = true; }
      return count($total);
      Gremlin generalizes the navigation
  41. Thanks! dams@php.net, http://www.slideshare.net/dseguy/, at http://confoo.ca/
  42. Kevin Bacon
      Suggest collaborators to authors?
      Authors who worked with an author's collaborators, but not with the author, are recommendations
      g.V.has('name', 'contrib').sideEffect{init = it}.out('wrote').in('wrote').except(init).has('name', 'contrib2').path
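
The traversals in the slides (out, in, except, unique, count, groupCount) can be modelled in a few lines of plain Python, which may help when reading the Gremlin one-liners. This is only a sketch under assumptions: it is not the Gremlin or Neo4j API, the helper names (`out`, `in_`, `except_`, `unique`, `collaborators`) are invented here, and the edge list is hypothetical sample data that merely reuses a few names from the slides.

```python
from collections import Counter

# Hypothetical sample data (not the real PECL graph): (source, label, target).
EDGES = [
    ("Derick Rethans", "WROTE", "ext/datetime"),
    ("Derick Rethans", "WROTE", "ext/xdebug"),
    ("Derick Rethans", "WROTE", "ext/mongo"),
    ("Hannes Magnusson", "WROTE", "ext/mongo"),
    ("Jeremy Mikola", "WROTE", "ext/mongo"),
    ("Date and Time", "HAS", "ext/datetime"),
    ("Tools", "HAS", "ext/xdebug"),
    ("Database", "HAS", "ext/mongo"),
]


def unique(nodes):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(nodes))


def out(nodes, label):
    """Follow outgoing edges with this label; duplicates are kept, like a pipe."""
    return [t for n in nodes for (s, l, t) in EDGES if s == n and l == label]


def in_(nodes, label):
    """Follow incoming edges with this label; duplicates are kept, like a pipe."""
    return [s for n in nodes for (s, l, t) in EDGES if t == n and l == label]


def except_(nodes, excluded):
    """Remove the excluded nodes (a collection) from the stream."""
    return [n for n in nodes if n not in excluded]


# Every vertex of the toy graph, sources first, then targets.
ALL_VERTICES = unique([s for s, _, _ in EDGES] + [t for _, _, t in EDGES])


def collaborators(author):
    """Authors who share at least one extension with `author`."""
    shared = in_(out([author], "WROTE"), "WROTE")
    return unique(except_(shared, [author]))


if __name__ == "__main__":
    authors_stream = in_(ALL_VERTICES, "WROTE")
    print(len(authors_stream))          # one entry per WROTE edge
    print(len(unique(authors_stream)))  # distinct authors
    print(collaborators("Derick Rethans"))
    # Extensions per author, in the spirit of groupCount(m){it.name}
    print(Counter(in_(out(ALL_VERTICES, "HAS"), "WROTE")))
```

Each helper takes a list and returns a list, so calls nest the way Gremlin steps chain. That also shows why counting without `unique` over-counts: the stream holds one entry per edge followed, not one per distinct vertex.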