Graph Databases in Python (PyCon Canada 2012)

Since the NoSQL concept burst onto the market, graph databases have traditionally been designed to be used from Java or C. With some honorable exceptions, there is no easy way to manage graph databases from Python. In this talk, I will introduce you to some of the tools you can use today to work with these new and challenging databases from our favorite language, Python.


  1. GRAPH DATABASES IN PYTHON
     Javier de la Rosa, @versae
     The CulturePlex Lab, Western University, London, ON
     PyCon Canada 2012
  2. WHO I AM
     ● Javier de la Rosa
     ● versae
     ● versae
     ● Computer Scientist and Humanist
     ● CulturePlex Lab
     ● CulturePlex
     Graph Databases in Python, Javier de la Rosa, PyCon Canada, 2012
  3. FIRST OF ALL
     “You do not really understand something unless you can explain it to your grandmother”
     – (Frequently attributed to) Richard Feynman
  4. DATABASES (in the last 30 years)
     ● Data in tables, rows and columns
     ● Pretty basic mechanism to make connections:
       – Primary keys, foreign keys, and... that's all
     ● Relational, ahem, really?
  5. DATABASES (in the last 30 years)
     ● Rigid data schemas
       – Have you ever tried to do a schema migration?
     ● Relational Algebra and SQL
       – Terrible for highly interconnected data
       – JOINs can take a lifetime to finish (a bit overdramatized)
  6. NoSQL, Not Only SQL
     ● Document
       – MongoDB, CouchDB, etc.
     ● Key-value stores
       – Redis, Riak, Voldemort, Dynamo, etc.
     ● Big Tables
       – Cassandra, HBase, etc.
     ● Analytic
       – Hadoop
     ● Graph
       – Neo4j, OrientDB, HyperGraphDB, Titan, etc.
     ● Other
       – Objectivity/DB, ZODB, etc.
  7. DATABASES LANDSCAPE
     Source: 451 Research, https://451research.com/report-long?icid=2289
  8. WHO IS USING GRAPHS?
     ● Mozilla with Pancake and Pacer
       – https://wiki.mozilla.org/Pancake & http://pangloss.github.com/pacer/
     ● Twitter with FlockDB
       – https://github.com/twitter/flockdb
     ● Facebook with Open Graph
       – https://developers.facebook.com/docs/opengraph/
     ● Google with Knowledge Graph
       – http://www.google.ca/insidesearch/.../knowledge.html
  9. WHY GRAPHS?
     ● Data is getting more and more connected
       – From text documents, to wikis, to ontologies, to folksonomies, etc.
     ● And more semi-structured
       – Think about the decentralization of content generation
     ● And more complex
       – Social networks, semantic trending, etc.
     Source: Neo Technology, http://www.slideshare.net/emileifrem/neo4j-the-benefits-of-graph-databases-oscon-2009
  10. A FEW OF THE CURRENT USES
     ● Social Networking and Recommendations
     ● Network and Cloud Management
     ● Master Data Management
     ● Geospatial
     ● Bioinformatics
     ● Content Management and Security and Access Control
     Source: Mashable, http://mashable.com/2012/09/26/graph-databases/
  11. AND WHY ELSE?
     ● Because graphs are cool!
     [Image: Leonhard Euler]
  12. WHAT IS A GRAPH?
     ● G = (V, E), where
       – G is a graph
       – V is a set of vertices
       – E is a set of edges
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_(mathematics)
  13. WHAT IS A GRAPH?
     ● G = (V, E)
       – Graph, aka network, diagram, etc.
       – Vertex, aka point, dot, node, element, etc.
       – Edge, aka relationship, arc, line, link, etc.
     ● Basically, “a graph states that something is related to something else”
       – Svetlana Sicular, Research Director at Gartner
     Source: Gartner, http://blogs.gartner.com/svetlana-sicular/think-graph/
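
The definition G = (V, E) maps quite directly onto Python's built-in sets. The following is only a minimal sketch: the vertex names and the helper functions `neighbors` and `degree` are illustrative assumptions, not something shown in the talk.

```python
# A graph G = (V, E): a set of vertices and a set of edges.
# The vertex names below are made up for illustration.
V = {"a", "b", "c", "d"}

# Undirected edges: each edge is an unordered pair, hence frozenset.
E = {frozenset(pair) for pair in [("a", "b"), ("b", "c"), ("c", "a")]}

G = (V, E)


def neighbors(graph, vertex):
    """Return the set of vertices that share an edge with `vertex`."""
    _, edges = graph
    return {
        other
        for edge in edges
        if vertex in edge
        for other in edge
        if other != vertex
    }


def degree(graph, vertex):
    """Return the number of edges touching `vertex`."""
    _, edges = graph
    return sum(1 for edge in edges if vertex in edge)


print(sorted(neighbors(G, "a")))  # ['b', 'c']
print(degree(G, "d"))             # 0, "d" is an isolated vertex
```

Using `frozenset` for edges makes ("a", "b") and ("b", "a") the same edge; a directed graph would use plain tuples instead.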
  14. TYPES OF GRAPH
     [Figures: Undirected; Digraph]
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_(mathematics)
  15. TYPES OF GRAPH
     [Figures: Multigraph; Hypergraph]
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_(mathematics)
  16. SOME GRAPHS EVEN HAVE A NAME
     ● Complete graphs
     [Figures: K3, K5, K8]
     Source: Wikipedia, http://en.wikipedia.org/wiki/Gallery_of_named_graphs
  17. SOME GRAPHS EVEN HAVE A NAME
     ● Stars
     [Figure: the star graphs S3, S4, S5 and S6]
     Source: Wikipedia, http://en.wikipedia.org/wiki/Gallery_of_named_graphs
  18. SOME GRAPHS EVEN HAVE A NAME
     ● Snarks
     [Figures: Blanuša (second), Szekeres, Double star]
     Source: Wikipedia, http://en.wikipedia.org/wiki/Gallery_of_named_graphs
  19. THINGS CAN GET COMPLICATED...
     [Figure: Local McLaughlin graph]
     Source: Wikipedia, http://en.wikipedia.org/wiki/Gallery_of_named_graphs
  20. WAIT A SEC,
  21. DON'T WORRY
     ● Just one more type: the Property Graph
     [Figure: a graph with four numbered nodes and four numbered edges]
  22. THE PROPERTY GRAPH
     ● Directed, attributed and multi-relational
     [Figure: example property graph.
      Nodes 1 to 4: Name: Javi; Name: David; Name: John; Title: The Art of Computer Programming, Price: $135.
      Edges 1 to 4: Knows (Since: 2009); Knows (Since: 1990); Likes; Likes]
  23. THE PROPERTY GRAPH
     ● A set of nodes, and each node has:
       – A unique identifier.
       – A set of outgoing edges.
       – A set of incoming edges.
       – A collection of properties defined by a map from key to value.
     ● A set of relationships, and each relationship has:
       – A unique identifier.
       – An outgoing tail vertex.
       – An incoming head vertex.
       – A collection of properties defined by a map from key to value.
     Source: TinkerPop, https://github.com/tinkerpop/gremlin/wiki/Defining-a-Property-Graph
  24. IN SHORT
     ● A Property Graph is composed of:
       – A set of nodes
       – A set of relationships
       – Properties and ids on both
     ● Sometimes, nodes and relationships can be typed
       – In Blueprints and Neo4j, a label denotes the type of relationship between its two nodes.
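
The property graph definition can be sketched with plain Python dataclasses. This is an illustration under stated assumptions, not how Blueprints or Neo4j store data: the class names `Node`, `Relationship` and `PropertyGraph` and their methods are hypothetical, and which nodes the "Knows" edge connects is assumed, since the slide figure does not survive in the transcript.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(eq=False)  # compare by identity; the structure is cyclic
class Node:
    id: int
    properties: dict = field(default_factory=dict)
    outgoing: list = field(default_factory=list)  # relationships leaving this node
    incoming: list = field(default_factory=list)  # relationships arriving at this node


@dataclass(eq=False)
class Relationship:
    id: int
    tail: Node   # the node the relationship leaves from
    label: str   # the type of the relationship
    head: Node   # the node the relationship points to
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory container for nodes and relationships."""

    def __init__(self):
        self.nodes = {}
        self.relationships = {}
        self._ids = count(1)  # source of unique identifiers

    def add_node(self, **properties):
        node = Node(next(self._ids), properties)
        self.nodes[node.id] = node
        return node

    def add_relationship(self, tail, label, head, **properties):
        rel = Relationship(next(self._ids), tail, label, head, properties)
        tail.outgoing.append(rel)
        head.incoming.append(rel)
        self.relationships[rel.id] = rel
        return rel


g = PropertyGraph()
javi = g.add_node(name="Javi")
david = g.add_node(name="David")
knows = g.add_relationship(javi, "Knows", david, since=2009)  # wiring assumed

print(knows.label, knows.properties["since"])
print([rel.head.properties["name"] for rel in javi.outgoing])
```

Every element carries an id and a key/value map, and each relationship is directed from its tail to its head, which covers the "directed, attributed and multi-relational" description.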
  25. GRAPH DATABASES
     ● A graph database uses graph structures with nodes, edges, and properties to represent and store data
       – ...but there is no easy way to visualize this
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_database
  26-32. HOW IT LOOKS IN PYTHON?
     # Let's create a graph
     >>> silvester = g.nodes.create(name="Silvester")
     >>> arnold = g.nodes.create(name="Arnold")
     >>> punch = arnold.punches(silvester)
     [Diagram: (Name: Arnold) punches (Name: Silvester)]
  33-37. HOW IT LOOKS IN PYTHON?
     >>> chuck = g.nodes.create(name="Chuck")
     >>> chuck.dropkicks(silvester)
     >>> chuck.dropkicks(arnold)
     [Diagram: (Name: Arnold) punches (Name: Silvester);
      (Name: Chuck) dropkicks (Name: Silvester);
      (Name: Chuck) dropkicks (Name: Arnold)]
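
To see how a call such as `arnold.punches(silvester)` can work at all, here is a toy in-memory graph that mimics the calls on these slides. It is a sketch, not the implementation of any real client: `ToyGraph`, `ToyNodes` and `ToyNode` are hypothetical names, and relationships are stored as plain tuples.

```python
class ToyNode:
    def __init__(self, graph, **properties):
        self.graph = graph
        self.properties = properties

    def __getattr__(self, label):
        # Called only when normal attribute lookup fails, so any unknown
        # public attribute name is treated as a relationship label.
        if label.startswith("_"):
            raise AttributeError(label)

        def create(target, **properties):
            edge = (self, label, target, properties)
            self.graph.edges.append(edge)
            return edge

        return create


class ToyNodes:
    def __init__(self, graph):
        self.graph = graph
        self.items = []

    def create(self, **properties):
        node = ToyNode(self.graph, **properties)
        self.items.append(node)
        return node


class ToyGraph:
    def __init__(self):
        self.nodes = ToyNodes(self)
        self.edges = []  # (source, label, target, properties) tuples


g = ToyGraph()
silvester = g.nodes.create(name="Silvester")
arnold = g.nodes.create(name="Arnold")
chuck = g.nodes.create(name="Chuck")
arnold.punches(silvester)
chuck.dropkicks(silvester)
chuck.dropkicks(arnold)

for source, label, target, _ in g.edges:
    print(source.properties["name"], label, target.properties["name"])
```

The trick is `__getattr__`: the method name becomes the relationship label, so no label has to be declared up front.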
  38. GRAPH DATABASES LANDSCAPE
     Database       Data Model        Query Method                License           Python Binding
     Neo4j          Property Graph    Cypher, Gremlin, Traversal  GPL, AGPL         Native, Blueprints, REST
     OrientDB       Property Graph    Gremlin, Traversal          Apache 2          Blueprints
     HyperGraphDB   Typed Hypergraph  HGQuery, Traversal          LGPL              Nope
     DEX            Property Graph    Traversal                   Commercial        Blueprints
     Titan          Property Graph    Gremlin                     Apache 2          Blueprints
     InfoGrid       Property Graph    Traversal                   AGPL, Commercial  Nope
     InfiniteGraph  Property Graph    Gremlin                     Commercial        Nope
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_database
  39. GRAPH DATABASES LANDSCAPE
     And more:
       – AffinityDB
       – YarcData uRiKA
       – Apache Giraph
       – Cassovary
       – StigDB
       – NuvolaBase
       – Pegasus
       – Microsoft Trinity
       – Sherlock
       – And so on
  41. GREMLIN, BLUEPRINTS, WAT?
     Let me introduce you to the TinkerPop stack
     Source: TinkerPop, http://www.tinkerpop.com/
  42. BLUEPRINTS AND REXSTER
     ● Blueprints is a property graph model interface
     ● Rexster is a server that exposes any Blueprints graph through REST
     Source: TinkerPop, http://www.tinkerpop.com/
  43. AND WHAT ABOUT PYTHON?
     ● Options to connect to a Blueprints Graph Database
     [Diagram labels: OrientDB, Neo4j, DEX, Titan; Blueprints API; Rexster; REST; bulbflow, python-blueprints, pyblueprints]
  44. BULBFLOW
     ● Create
       >>> alice = g.vertices.create(name="Alice")
       >>> bob = g.vertices.create(name="Bob")
       >>> g.edges.create(alice, "knows", bob)
     ● Get
       >>> alice = g.vertices.get(1)
       >>> bob = g.vertices.get(2)
     ● Update
       >>> alice.age = 21
       >>> alice.save()
     ● Delete
       >>> alice.delete()
     Source: Bulbflow, http://bulbflow.com/docs/
  45. PYBLUEPRINTS
     ● Create
       >>> alice = g.addVertex()
       >>> alice.setProperty("name", "Alice")
       >>> bob = g.addVertex()
       >>> bob.setProperty("name", "Bob")
       >>> g.addEdge(alice, bob, "knows")
     ● Get
       >>> alice = g.getVertex(1)
       >>> bob = g.getVertex(2)
     ● Update
       >>> alice.setProperty("age", 21)
     ● Delete
       >>> g.removeVertex(alice.getId())
     Source: PyBlueprints, https://github.com/escalant3/pyblueprints
  46. BUT NEO4J HAS ITS OWN CLIENTS!
     ● REST Clients for Neo4j
     [Diagram labels: neo4j-rest-client, py2neo; OrientDB, Neo4j, DEX, Titan; Blueprints API; Rexster; REST; bulbflow, python-blueprints, pyblueprints]
  47. HOW CAN I LOOK UP?
     ● An index is a data structure that supports the fast lookup of elements by some key/value pair
     Source: TinkerPop, https://github.com/tinkerpop/blueprints/wiki/Graph-Indices
  48. INDICES
     ● In the Python bindings, they are similar to a dict
       – bulbflow
         # bulbflow creates automatic indices to make basic lookups easier
         >>> nodes = g.vertices.index.lookup(name="Alice")
         >>> for node in nodes:
         ...     print(node)
       – PyBlueprints
         >>> index = g.getIndex("names", "vertex")
         >>> index.put("name", alice.getProperty("name"), alice)
         >>> nodes = index.get("name", "Alice")
         >>> for node in nodes:
         ...     print(node)
  49. INDICES
     ● Some Graph Databases provide full-text queries
       – bulbflow
         >>> nodes = g.vertices.index.query(name="ali*")
         >>> for node in nodes:
         ...     print(node)
       – PyBlueprints
         >>> index = g.getIndex("names", "vertex")
         >>> nodes = index.query("name", "ali*")
         >>> for node in nodes:
         ...     print(node)
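
The index idea from the slides above (fast lookup of elements by a key/value pair, plus wildcard queries) can be sketched with a dictionary. `ToyIndex` is a hypothetical class whose `put`, `get` and `query` methods only imitate the shape of the PyBlueprints calls; the wildcard matching here is a simple scan and is an assumption about semantics, not how a real full-text index works.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class ToyIndex:
    """Maps (key, value) pairs to the elements stored under them."""

    def __init__(self):
        self._entries = defaultdict(list)

    def put(self, key, value, element):
        self._entries[(key, value)].append(element)

    def get(self, key, value):
        # Exact lookup: one dict access, independent of the graph size.
        return list(self._entries.get((key, value), []))

    def query(self, key, pattern):
        # Wildcard lookup, case-insensitive. This scans every entry;
        # a real full-text index avoids that scan.
        hits = []
        for (k, v), elements in self._entries.items():
            if k == key and fnmatchcase(str(v).lower(), pattern.lower()):
                hits.extend(elements)
        return hits


alice = {"id": 1, "name": "Alice"}
bob = {"id": 2, "name": "Bob"}

index = ToyIndex()
index.put("name", alice["name"], alice)
index.put("name", bob["name"], bob)

print(index.get("name", "Alice"))
print(index.query("name", "ali*"))
```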
  50. ...MORE COMPLEX SEARCHES?
     “Without traversals [FlockDB] is only a persisted graph. But not a graph database.”
     – Alex Popescu
     Source: myNoSQL, http://nosql.mypopescu.com/
  51. LET'S TRAVERSE THE GRAPH!
     ● “A graph traversal is the problem of visiting all the nodes in a graph in a particular manner”
       – A* search
       – Alpha-beta pruning
       – Breadth-First Search (BFS)
       – Depth-First Search (DFS)
       – Dijkstra's algorithm
       – Floyd-Warshall algorithm
       – Etc.
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_traversal
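
Two of the traversals listed above, BFS and DFS, fit in a few lines of plain Python. This is a minimal sketch over an adjacency-list dictionary; the graph and the names in it are made up for illustration and have nothing to do with any particular database.

```python
from collections import deque

# Adjacency list for a small directed graph (illustrative data).
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def bfs(graph, start):
    """Breadth-first: visit nodes level by level, nearest first."""
    visited, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


def dfs(graph, start, visited=None):
    """Depth-first: follow each path to its end before backtracking."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited


print(bfs(graph, "alice"))  # ['alice', 'bob', 'carol', 'dave']
print(dfs(graph, "alice"))  # ['alice', 'bob', 'dave', 'carol']
```

Both visit every reachable node exactly once; they differ only in the order, which is what "in a particular manner" refers to.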
  52. NEO4J TRAVERSAL API
     ● Python-embedded (native Neo4j Python binding)
       >>> traverser = gdb.traversal().relationships('knows').traverse(alice)
       # The graph is traversed as you loop through the result
       >>> for node in traverser.nodes:
       ...     print(node)
     ● neo4j-rest-client
       >>> traverser = alice.traverse(types=[client.All.knows])
       # The graph is traversed as you loop through the result
       >>> for node in traverser:
       ...     print(node)
  53. BLUEPRINTS GREMLIN
     ● Gremlin is a domain-specific language for traversing property graphs
       – Defines how to do a query based on the graph structure
       >>> gremlin = g.extensions.GremlinPlugin.execute_script
       >>> params = {'alice_id': alice.id}
       >>> script = "g.V(alice_id).out('knows')"
       >>> node = gremlin(script=script, params=params)
       >>> node == bob
     Source: TinkerPop Gremlin, https://github.com/tinkerpop/gremlin/wiki
     Source: Marko Rodríguez, The Graph Traversal Programming Pattern, http://www.slideshare.net/slidarko/graph-windycitydb2010
  54-56. NEO4J CYPHER QUERY LANGUAGE
     ● Declarative graph query language
       – Expressive and efficient querying
       – Focused on expressing what to retrieve from a graph
       – Inspired by SQL
       – Pattern matching expressions from SPARQL
     [Diagram: node 1 and node 2 joined by a relationship named "label"]
     As a pattern:
       (1) -[:label]- (2)
     As a query:
       START n=(1), m=(2) MATCH n-[r:label]-m RETURN r
     Source: Wikipedia, https://en.wikipedia.org/wiki/Graph_database
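
To make the declarative pattern concrete, here is roughly what the query above asks for, written imperatively in Python. It is a sketch of the pattern's meaning only, assuming relationships stored as (tail, label, head) triples with made-up data; the `match` helper is hypothetical and says nothing about how Neo4j actually evaluates Cypher.

```python
# Relationships as (tail_id, label, head_id) triples (illustrative data).
relationships = [
    (1, "label", 2),
    (2, "other", 1),
    (1, "label", 3),
]


def match(relationships, n, label, m):
    """Return relationships of type `label` between n and m.

    The pattern n-[r:label]-m has no arrow, so direction is ignored.
    """
    return [
        (tail, rel_label, head)
        for tail, rel_label, head in relationships
        if rel_label == label and {tail, head} == {n, m}
    ]


print(match(relationships, 1, "label", 2))  # [(1, 'label', 2)]
```

The declarative version states only the shape to find; the loop, the filter and the direction handling are left to the database.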
  57. PY2NEO CYPHER HELPERS
     ● Get or create elements
       >>> g.get_or_create_relationships(
       ...     (bob, "WORKS WITH", carol, {"since": 2004}),
       ...     (alice, "DISLIKES!", carol, {"reason": "youth"}),
       ...     (bob, "WORKS WITH", dave, {"since": 2009}),
       ... )
     ● Get counts
       >>> nodes_count = g.get_node_count()
       >>> rels_count = g.get_relationship_count()
     ● Delete
       >>> g.delete()
     Source: py2neo, http://py2neo.org/
  58. NEO4J-REST-CLIENT CYPHER HELPERS
     ● Query casting
       >>> q = """start n=node(*) match n-[r:punches]-() """ \
       ...     """return n, n.name, r, r.since"""
       >>> results = g.query(q, returns=(Node, unicode, Relationship, int))
     ● Complex filtering
       lookups = (
           Q("name", exact="Arnold")
           & (Q("surname", istartswith="swar")
              & ~Q("surname", iendswith="chenegger"))
       )
       arnolds = g.nodes.filter(lookups)
     Source: neo4j-rest-client, https://github.com/versae/neo4j-rest-client
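
The `&` and `~` operators in the filtering example rely on operator overloading. The sketch below shows one way such composable lookups could be built; it is not neo4j-rest-client's `Q` class. `Predicate` and the helpers `exact`, `istartswith` and `iendswith` are hypothetical, and they filter plain dictionaries in memory rather than querying a database.

```python
class Predicate:
    """A composable test over a dict of properties (hypothetical sketch)."""

    def __init__(self, test):
        self.test = test

    def __and__(self, other):
        return Predicate(lambda props: self.test(props) and other.test(props))

    def __or__(self, other):
        return Predicate(lambda props: self.test(props) or other.test(props))

    def __invert__(self):
        return Predicate(lambda props: not self.test(props))

    def __call__(self, props):
        return self.test(props)


def exact(key, value):
    return Predicate(lambda props: props.get(key) == value)


def istartswith(key, prefix):
    # Case-insensitive "starts with".
    return Predicate(
        lambda props: str(props.get(key, "")).lower().startswith(prefix.lower())
    )


def iendswith(key, suffix):
    # Case-insensitive "ends with".
    return Predicate(
        lambda props: str(props.get(key, "")).lower().endswith(suffix.lower())
    )


lookups = exact("name", "Arnold") & (
    istartswith("surname", "swar") & ~iendswith("surname", "chenegger")
)

people = [
    {"name": "Arnold", "surname": "Schwarzenegger"},
    {"name": "Arnold", "surname": "Swarovski"},
    {"name": "Chuck", "surname": "Swarm"},
]
print([p["surname"] for p in people if lookups(p)])  # ['Swarovski']
```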
  59. LET'S PLAY!
     ● Deploy Neo4j on Heroku or Amazon
     ● Use one of the available clients
  60. NEO4J HEROKU ADD-ON
     ● Create a Heroku app and add the Neo4j add-on
       $ heroku apps:create pyconca
       $ heroku addons:add neo4j --app pyconca
       $ xdg-open `heroku config:get NEO4J_URL --app pyconca`
       $ export NEO4J_URL=`heroku config:get NEO4J_URL --app pyconca`
     ● Create a virtualenv with neo4j-rest-client
       $ mkvirtualenv --no-site-packages pyconca
       $ workon pyconca
       $ pip install ipython neo4jrestclient
       $ ipython
  61-62. NEO4J HEROKU ADD-ON
     ● Run IPython and that's it!
       >>> import os
       >>> NEO4J_URL = os.environ["NEO4J_URL"]
       >>> from neo4jrestclient import client
       >>> gdb = client.GraphDatabase(NEO4J_URL + "/db/data")
       >>> gdb.url
  63. THANKS! Questions?
     Javier de la Rosa, @versae
     The CulturePlex Lab, Western University, London, ON
     PyCon Canada 2012
  64. APPENDIX: DATA MODELS
     ● neo4django
       – https://github.com/scholrly/neo4django
     ● neomodel
       – https://github.com/robinedwards/neomodel
     ● bulbflow models
       – http://bulbflow.com/quickstart/#models
  65. APPENDIX: VISUALIZE YOUR GRAPH
     ● Export somehow to .gexf for Gephi
       – http://gephi.org/
     ● Use D3.js
       – http://d3js.org/
     ● Use sigma.js
       – http://sigmajs.org/
     ● Take a look at Max De Marzi's work
       – http://maxdemarzi.com/category/visualization/
     ● Use Sylva (for newbies)
       – http://www.sylvadb.com/
