The document discusses Neo4j, a native graph database. It begins by defining what a graph database is - a collection of nodes and edges where nodes represent entities and edges represent relationships between nodes. It then discusses Neo4j specifically, describing it as an open-source, native graph database that implements the property graph model at the storage level in a highly scalable and schema-free manner. Example use cases for Neo4j include fraud detection, social networks, and knowledge graphs.
2. What is a graph database?
A graph database is essentially a collection of nodes and edges. Each node represents an entity (such as a person or business) and each edge represents a connection or relationship between two nodes.
4. Components
● Nodes (equivalent to vertices in graph theory).
○ These are the main data elements, interconnected through relationships.
○ A node can have zero or more labels (that describe its role) and properties (i.e. attributes).
● Relationships (equivalent to edges in graph theory).
○ A relationship connects two nodes, and each node can in turn have multiple relationships.
○ Relationships can have properties.
○ Relationships always have a direction.
● Labels.
○ These are used to group nodes, and each node can be assigned multiple labels.
○ Labels are indexed to speed up finding nodes in a graph.
● Properties.
○ These are attributes of both nodes and relationships.
○ Neo4j stores properties as key-value pairs; values can be strings, numbers, or booleans, among other types.
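The components above can be sketched as a small in-memory model. This is only an illustration of the property graph idea, not how Neo4j stores data internally; the class names Node and Relationship are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    """A graph node: a set of labels plus key-value properties."""
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from `start` to `end` with its own properties."""
    start: Node
    end: Node
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


# Two nodes, each with one label and one property.
artist = Node(labels={"Artist"}, properties={"Name": "Strapping Young Lad"})
album = Node(labels={"Album"}, properties={"Name": "Heavy as a Really Heavy Thing"})

# One relationship pointing from the artist to the album.
released = Relationship(start=artist, end=album, type="RELEASED")

print(released.start.properties["Name"], "-[" + released.type + "]->",
      released.end.properties["Name"])
```

The direction is carried by which node is `start` and which is `end`, mirroring the rule that relationships always have a direction.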
6. When to use a graph database?
● Many many-to-many relationships. The data contains a large number of many-to-many relationships.
● High value of relationships. If the relationships between your data elements are just as important as, or more important than, the elements themselves, you should consider using a graph database.
● Low latency at large scale. Graph databases can navigate the relationships in large data sets more quickly than other types of databases, and this is what justifies their additional complexity.
7. When not to use a graph database
● Where data is disconnected and relationships do not matter.
● Where you are optimizing for writing and storing data and do not need to read or query it.
● Where core data objects or the data model stay consistent and the data structure is fixed and tabular.
● Where queries execute bulk data scans or do not start from a known data point.
● Where you will use it as a key-value store.
● Where large amounts of text or BLOBs need to be stored as properties.
● Where you have to process high volumes of transactions.
● Where you have to handle queries that span the entire database.
8. Neo4j
● Neo4j is an open-source, NoSQL, native graph database that provides an
ACID-compliant transactional backend for your applications.
● It is referred to as a native graph database because it implements the property graph model efficiently, down to the storage level.
● It is highly scalable and schema-free.
● It is the world's most popular graph database management system.
● Neo4j is implemented in Java and can be accessed from software written in other languages using the Cypher Query Language (CQL) through a transactional HTTP endpoint.
● The database uses pointers to navigate and traverse the graph.
9. Advantages of Neo4j
● Performance: performance remains high even if the amount of data grows significantly.
● Flexibility: the data structure can be changed easily without breaking existing functionality.
● Agility: the structure of a Neo4j database is easy to change, so the data store can evolve along with your application.
10. Use cases
● Fraud detection and analytics
● Network and database infrastructure monitoring
● Recommendation engines
● Social networks
● Knowledge graphs
● Identity and access management
● Privacy and risk compliance
● Master data management
Over 300 commercial customers and over 750 startups use Neo4j.
Flagship customers include eBay, Walmart, Cisco, Citibank, ING, UBS, HP,
Microsoft, IBM, Thomson Reuters, Amadeus Travel, Caterpillar, Volvo and many
more.
11. Cypher – A Next-Generation Query Language
● Cypher was inspired by SQL, but is optimized specifically for graphs.
● The syntax is concise and straightforward, allowing users to easily write all the normal CRUD operations in a simple and maintainable way.
12. Nodes: Cypher uses ASCII art to represent patterns. We surround nodes with parentheses, which look like circles: (p:Person).
Relationships: A relationship is basically an arrow --> between two nodes. Additional information can be placed in square brackets inside the arrow:
● relationship types, like -[:KNOWS|LIKE]->
● a variable name before the colon, like -[rel:KNOWS]->
● additional properties, like -[{since:2010}]->
● structural information for paths of variable length, like -[:KNOWS*..4]-> (one to four hops)
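To make the variable-length pattern -[:KNOWS*..4]-> more concrete, here is a minimal sketch of a hop-limited traversal over an in-memory adjacency list. The names (Ann, Bob and so on) and the helper reachable_within are hypothetical. Cypher itself returns every matching path, whereas this sketch only reports which nodes are reachable and in how few hops.

```python
from collections import deque

# Hypothetical directed KNOWS relationships, stored as an adjacency list.
KNOWS = {
    "Ann": ["Bob"],
    "Bob": ["Cid"],
    "Cid": ["Dee"],
    "Dee": ["Eve"],
    "Eve": ["Fay"],
    "Fay": [],
}


def reachable_within(graph, start, max_hops):
    """Return {node: fewest hops} for nodes reachable in 1..max_hops hops."""
    hops = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                hops[neighbour] = depth + 1
                queue.append((neighbour, depth + 1))
    return hops


result = reachable_within(KNOWS, "Ann", 4)
print(result)
```

With a limit of four hops, Fay (five hops away from Ann) is left out, which is the effect of the upper bound in *..4.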
13. Comparison with other NoSQL databases
● Data storage
○ Other NoSQL databases: no support for connected data at the database level.
○ Neo4j (native graph database): fast transactions on connected data.
● Data modeling
○ Other NoSQL databases: data model not suitable for enterprise architectures, as wide-column and document stores do not offer control at the design level.
○ Neo4j: flexible, "whiteboard-friendly" data model allows for fine-grained control of data architecture.
● Query performance
○ Other NoSQL databases: no graph processing capability for data relationships.
○ Neo4j: native graph processing keeps latency low and supports real-time performance.
● Query language
○ Other NoSQL databases: no query constructs exist to express data relationships.
○ Neo4j: Cypher, a native graph query language.
14. Sample demo: Creating a music database that contains band names and their albums.
Step 1. Create a node: Strapping Young Lad
The first band will be called Strapping Young Lad, so we will create an Artist node and call it Strapping Young Lad.
CREATE (a:Artist { Name : "Strapping Young Lad" })
● Artist: the node's label
● Name: a property of the node, with the value "Strapping Young Lad"
● a: a variable name, useful if we need to refer to the node later in the statement
15. 1.1. Create and Display node
The CREATE statement creates the node but doesn't display it. To display the node, you need to follow it with a RETURN clause.
Let's create another node, this time an album, and follow the CREATE with a RETURN clause.
CREATE (b:Album { Name : "Heavy as a Really Heavy Thing", Released : "1995" })
RETURN b
16. 1.2 Creating Multiple Nodes
You can create multiple nodes at once by separating each node with a comma:
CREATE (a:Album { Name: "Killers"}), (b:Album { Name: "Fear of the Dark"})
RETURN a,b
17. Step 2: Creating a relationship between nodes
The statement for creating a relationship consists of CREATE, followed by the details of the relationship that you're creating.
First, let's create a relationship between an artist and an album.
18. Query:
MATCH (a:Artist),(b:Album)
WHERE a.Name = "Strapping Young Lad" AND b.Name = "Heavy as a Really Heavy Thing"
CREATE (a)-[r:RELEASED]->(b)
RETURN r
● The MATCH clause finds the two nodes to create the relationship between.
● There could be many nodes with an Artist or Album label, so we narrow the match down to just the nodes we're interested in. In this case we filter on the "Name" property.
● The relationship is created using CREATE. It is written as an ASCII-art pattern, with an arrow indicating the direction of the relationship: (a)-[r:RELEASED]->(b).
● We give the relationship a variable name of r and a type of RELEASED (as in "this band released this album"). The relationship's type is analogous to a node's label.
19. Retrieve a node
MATCH (p:Person {Name: "Devin Townsend"})
RETURN p
is roughly equivalent to the SQL query
SELECT * FROM Person
WHERE name = "Devin Townsend";
20. Retrieving a relationship
If we wanted to find out which artist released the album called Heavy as a Really Heavy Thing, we could use the following query:
MATCH (a:Artist)-[:RELEASED]->(b:Album)
WHERE b.Name = "Heavy as a Really Heavy Thing"
RETURN a
It matches all artists that released an album named Heavy as a Really Heavy Thing.
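22. Delete
Delete a node (this fails if the node still has relationships):
MATCH (a:Album {Name: "Killers"}) DELETE a
Delete all nodes (this works only if no node has relationships):
MATCH (n) DELETE n
Delete all nodes together with their relationships, emptying the database:
MATCH (n) DETACH DELETE n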
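The difference between DELETE and DETACH DELETE can be sketched with a toy in-memory graph. TinyGraph and its methods are hypothetical names used only for this illustration. They mimic the rule that a node cannot be deleted while relationships are still attached to it, unless those relationships are removed along with it.

```python
class TinyGraph:
    """Toy in-memory graph used only to illustrate delete semantics."""

    def __init__(self):
        self.nodes = set()
        self.relationships = set()  # (start, type, end) tuples

    def add_relationship(self, start, rel_type, end):
        self.nodes.update((start, end))
        self.relationships.add((start, rel_type, end))

    def _attached(self, node):
        # Relationships that start or end at the given node.
        return {r for r in self.relationships if node in (r[0], r[2])}

    def delete(self, node):
        """Like DELETE: refuse while relationships are still attached."""
        if self._attached(node):
            raise ValueError(f"{node!r} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        """Like DETACH DELETE: drop attached relationships, then the node."""
        self.relationships -= self._attached(node)
        self.nodes.discard(node)


g = TinyGraph()
g.add_relationship("Strapping Young Lad", "RELEASED", "Heavy as a Really Heavy Thing")

try:
    g.delete("Strapping Young Lad")
    plain_delete_failed = False
except ValueError:
    plain_delete_failed = True

g.detach_delete("Strapping Young Lad")
print(plain_delete_failed, g.nodes, g.relationships)
```

The plain delete is rejected because the artist still has a RELEASED relationship; the detach variant removes that relationship first and then the node.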
24. [Figure: tables involved in the project using an RDBMS] [Figure: Neo4j equivalent]
25. 5.1 Recommendation system
For cafeteria rating, I have so far implemented only nationality as a recommendation parameter.
E.g., if a student is Nepali, then the stalls that Nepali customers visited and rated well will be recommended. Only a portion of this is implemented.
26. Displaying stalls visited by Nepali customers who gave a good rating
Query:
// Dividing by 5.0 avoids integer division if the ratings are stored as integers
MATCH (cus:Customer)-[:Reviewed]->(s:Stall),
      (s)-[:HasMenu]-(m:Menu),
      (cus)-[:gaveReview]->(n:Review)
WHERE cus.nationality = 'Nepali'
  AND (n.taste + n.affordability + n.availability + n.behaviour + n.hygiene) / 5.0 > 3
RETURN s, m, count(cus) AS customernumber
ORDER BY customernumber DESC
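The filtering and counting logic of this query can be sketched in plain Python. The sample rows, the stall names and the recommend function are hypothetical, and the sketch groups by stall only, while the query groups by stall and menu. It is meant to show the averaging, thresholding and ordering steps, not to replace the graph query.

```python
from collections import Counter

# The five rating criteria used in the query.
CRITERIA = ("taste", "affordability", "availability", "behaviour", "hygiene")

# Hypothetical sample rows: (customer, nationality, stall, ratings per criterion).
REVIEWS = [
    ("c1", "Nepali", "Momo Corner", (5, 4, 4, 4, 4)),
    ("c2", "Nepali", "Momo Corner", (4, 4, 3, 4, 4)),
    ("c3", "Nepali", "Noodle Bar", (4, 3, 3, 4, 3)),
    ("c4", "Nepali", "Salad Stop", (2, 3, 2, 3, 2)),
    ("c5", "Thai", "Noodle Bar", (5, 5, 5, 5, 5)),
]


def recommend(reviews, nationality, threshold=3):
    """Count well-rated reviews per stall for one nationality, most reviewed first."""
    counts = Counter()
    for _customer, nat, stall, ratings in reviews:
        average = sum(ratings) / len(CRITERIA)  # true division, so 17 / 5 == 3.4
        if nat == nationality and average > threshold:
            counts[stall] += 1
    return counts.most_common()  # sorted by count, descending


print(recommend(REVIEWS, "Nepali"))
```

The third row averages 3.4, so it passes the threshold only because the average is computed with true division rather than integer division.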
27. Displaying stalls visited by Nepali customers who gave a good rating: RDBMS vs. Neo4j
RDBMS
SELECT s.stall_name, m.description
FROM stall s
INNER JOIN review r ON s.st_id = r.stall_id
INNER JOIN Givesreview gr ON r.r_id = gr.r_id
INNER JOIN customer c ON gr.c_id = c.c_id
INNER JOIN menu m ON s.stall_id = m.stall_id
WHERE (r.taste + r.affordability + r.availability + r.behaviour + r.hygiene) / 5.0 > 3
  AND c.nationality = 'Nepali';
NEO4J
MATCH (cus:Customer)-[:Reviewed]->(s:Stall),
      (s)-[:HasMenu]-(m:Menu),
      (cus)-[:gaveReview]->(n:Review)
WITH s, m, cus,
     (n.taste + n.affordability + n.availability + n.behaviour + n.hygiene) / 5.0 AS reviewfinal
WHERE cus.nationality = 'Nepali'
  AND reviewfinal > 3
RETURN s, m, count(cus) AS customernumber
ORDER BY customernumber DESC