4.4. GRAPH DATABASES
Basics
Graph databases model data in the form of nodes and relationships.
◦ Nodes: represent entities of different types (person, place, object, etc.); each node has
a set of attributes.
◦ Relationships: represented as links between the nodes; a link can be either directed
(unidirectional or bidirectional) or undirected.
Since the relationships between the entities are stored explicitly as links, querying for
related entities is typically simpler and faster in a graph database than in a relational
database, because complex join operations are avoided.
Graph databases are suitable for applications in which the primary focus is on
querying for relationships between entities and analyzing the relationships.
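The idea of following stored links instead of joining tables can be sketched in a few lines of Python. This is an illustrative in-memory model only, not how any particular graph database is implemented; the class name Graph, the node ids and the relationship types KNOWS and LIVES_IN are all made up for the example.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph: nodes with attributes, directed links stored explicitly."""

    def __init__(self):
        self.nodes = {}                     # node id -> attribute dict
        self.out_links = defaultdict(list)  # node id -> [(relationship type, target id)]

    def add_node(self, node_id, **attributes):
        self.nodes[node_id] = attributes

    def add_link(self, source, rel_type, target):
        self.out_links[source].append((rel_type, target))

    def related(self, node_id, rel_type):
        """Follow the links stored on one node; no table scan or join is involved."""
        return [t for (r, t) in self.out_links[node_id] if r == rel_type]


# Hypothetical sample data.
g = Graph()
g.add_node("alice", kind="person")
g.add_node("bob", kind="person")
g.add_node("paris", kind="place")
g.add_link("alice", "KNOWS", "bob")
g.add_link("bob", "LIVES_IN", "paris")

# Two-hop question: where do the people Alice knows live?
places = [place
          for friend in g.related("alice", "KNOWS")
          for place in g.related(friend, "LIVES_IN")]
print(places)  # ['paris']
```

In a relational schema the same question would usually need a join between a person table, a friendship table and a residence table; here each hop is a lookup on the current node.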
Neo4j
◦ Neo4j adopts a graph model that consists of
nodes and relationships. Both nodes and
relationships have properties, which are
captured as attributes (key-value pairs).
◦ Nodes are tagged with labels which are used
to represent different roles in the domain
being modeled.
Property graph model for an
eCommerce application
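A labeled property graph like the one in the figure can be approximated with two small Python data classes. This is a sketch for intuition only: the classes Node and Relationship here are plain Python stand-ins (not a Neo4j or client-library API), and the customer, product and property values are invented sample data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # roles in the domain, e.g. {"Customer"}
    properties: dict = field(default_factory=dict)   # key-value pairs


@dataclass
class Relationship:
    start: Node
    rel_type: str                                    # e.g. "ORDERS" or "RATES"
    end: Node
    properties: dict = field(default_factory=dict)   # relationships carry properties too


# Hypothetical sample data.
alice = Node({"Customer"}, {"name": "Alice", "city": "Atlanta", "country": "USA"})
shirt = Node({"Product"}, {"title": "T-shirt", "price": 15.0, "color": "blue"})

relationships = [
    Relationship(alice, "ORDERS", shirt, {"date": "2015-06-01", "quantity": 2}),
    Relationship(alice, "RATES", shirt, {"rating": 4}),
]


def ratings_for(product):
    """Return (customer name, rating) pairs for every RATES link that points at product."""
    return [(r.start.properties["name"], r.properties["rating"])
            for r in relationships
            if r.rel_type == "RATES" and r.end is product]


print(ratings_for(shirt))  # [('Alice', 4)]
```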
Neo4j
◦ Supports ACID (Atomicity, Consistency, Isolation, Durability) transactions.
◦ How to set up:
◦ # Download the stable release of Neo4j for Linux from http://neo4j.com/download/
◦ # Extract the archive
◦ tar -xf neo4j-*.tar.gz
◦ # Run Neo4j
◦ NEO4J_HOME/bin/neo4j start
◦ # Visit http://localhost:7474 in your web browser.
Neo4j - Cypher
◦ For create, read, update and delete (CRUD) operations,
Neo4j provides a query language called Cypher. Cypher
has some similarities with the SQL query language used
for relational databases.
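To make the CRUD mapping concrete, the sketch below pairs one Cypher statement with each operation and sends them through an object that has a run(query, **parameters) method, which is the shape of a session in the official Neo4j Python driver. The Customer label, the property names and the RecordingSession stand-in are assumptions made for the example, and the $name parameter syntax assumes a reasonably recent Neo4j version (older releases used {name}).

```python
# One Cypher statement per CRUD operation (labels and properties are hypothetical).
CREATE_CUSTOMER = "CREATE (c:Customer {name: $name, city: $city}) RETURN c"
READ_CUSTOMER = "MATCH (c:Customer {name: $name}) RETURN c.name AS name, c.city AS city"
UPDATE_CUSTOMER = "MATCH (c:Customer {name: $name}) SET c.city = $city RETURN c"
DELETE_CUSTOMER = "MATCH (c:Customer {name: $name}) DETACH DELETE c"


def crud_demo(session):
    """Run create, read, update and delete in order on any session-like object."""
    session.run(CREATE_CUSTOMER, name="Alice", city="Atlanta")
    session.run(READ_CUSTOMER, name="Alice")
    session.run(UPDATE_CUSTOMER, name="Alice", city="Boston")
    session.run(DELETE_CUSTOMER, name="Alice")


class RecordingSession:
    """Stand-in used here so the sketch runs without a Neo4j server."""

    def __init__(self):
        self.calls = []

    def run(self, query, **parameters):
        self.calls.append((query, parameters))


session = RecordingSession()
crud_demo(session)
for query, parameters in session.calls:
    print(query, parameters)
```

Against a real server, the RecordingSession would be replaced by a driver session; the statements themselves can also be pasted into the Neo4j web interface with the parameters filled in.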
◦ Neo4j exposes a variety of *REST APIs for performing the CRUD
(create, read, update, delete) operations. These REST APIs enable the
development of language-specific client libraries for Neo4j.
◦ Neo4j also provides a web interface from which you can execute
Cypher statements and view the graphs in the database.
*Note: Representational state transfer (REST) is a software architectural style that
defines a set of constraints to be used for creating Web services. RESTful Web services
allow the requesting systems to access and manipulate textual representations of Web
resources by using a uniform and predefined set of stateless operations.
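As a rough illustration of what a client library does on top of the REST interface, the sketch below builds (but does not send) an HTTP POST that carries a Cypher statement as JSON. The endpoint path /db/data/transaction/commit is an assumption based on older Neo4j releases; newer versions use a different path, so check the documentation for the installed version. The function name build_cypher_request is hypothetical.

```python
import json
from urllib.request import Request


def build_cypher_request(base_url, statement, parameters):
    """Build a POST request whose JSON body wraps one Cypher statement."""
    body = {"statements": [{"statement": statement, "parameters": parameters}]}
    return Request(
        url=base_url + "/db/data/transaction/commit",  # assumed, version-dependent path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


req = build_cypher_request(
    "http://localhost:7474",
    "MATCH (c:Customer {name: $name}) RETURN c",
    {"name": "Alice"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request (for example with urllib.request.urlopen) would additionally require a running server and, in most installations, authentication headers, which are left out here.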
Summary

Editor's Notes

  • #4 The example shows a labeled property graph model for an eCommerce application. In this graph, we have two types of nodes: Customer and Product. The Customer nodes have attributes such as customer name, address, city, country and zip code. The Product nodes have attributes such as product title, price, color, size, weight, etc. There are two types of relationships between the customer and product nodes: Orders and Rates. The Orders relationship between a customer and a product has properties such as the order date and quantity. The Rates relationship between a customer and a product has a single property to capture the customer rating.
  • #6 Box 4.8 shows examples of using Cypher for creating customer and product nodes and the relationships between the nodes.
  • #7 Box 4.9 shows an example of using the Py2neo client library. In this example, we first authenticate with a Neo4j server by providing the server hostname, username and password. Next, we create an instance of the Graph class which provides methods for creating nodes and relationships, searching nodes, executing Cypher queries and various other methods. To define a node, we create an instance of the Node class by providing the node label, name and properties. In this example, we define three nodes (c1,c2,c3) to represent three customers and two nodes (p1,p2) to represent two products. Next, we define the relationships between the nodes by creating instances of the Relationship class. For each relationship, we provide the related nodes, a label for the relationship and relationship properties. Finally, we use the create method of the Graph class to create the nodes and relationships. Figure 4.13 shows the graph created using the Python program in Box 4.9.
  • #8 Figure 4.14 provides a comparison of the four types of NoSQL databases. Key-value databases store data in the form of key-value pairs, where the keys uniquely identify the values stored. Hash functions are applied to the key to determine where the value should be stored. Document store databases store semi-structured data in the form of documents, which are encoded in formats such as JSON, XML, BSON or YAML. The benefit of using document databases over key-value databases is that they allow efficient querying of the documents based on the attribute values in the documents. Column family databases store data as columns, where a column has a name and a value. Columns are grouped into column families, and a collection of columns makes up a row, which is identified by a row-key. Column family databases support high-throughput reads and writes and have distributed and highly available architectures. Graph databases model data in the form of nodes and relationships. Nodes represent the entities in the data model and have a set of attributes. The relationships between the entities are represented in the form of links between the nodes.