Welcome
Everyone!
Introduction to
Neo4j
2022.5.17
Your Instructors:
Dan McNamara & Syd Beckett
To Do if Not Done Already:
Install Neo4j Desktop from
neo4j.com/download
Sign up for Neo4j Aura at
https://neo4j.com/cloud/aura/
- If local desktop install is problematic -
Create Sandbox on neo4j.com/sandbox
Today’s Instructors
Dan McNamara
Syd Beckett
Solution Engineers
at Neo4j Inc.
dan.mcnamara@neo4j.com
syd.beckett@neo4j.com
Our Plan for Today
Agenda
• Neo4j Platform Overview
• Installation/Setup
• Intro to Cypher
✔ w/ Exercises
Objectives & Outcomes
• Install and run Neo4j locally
• Learn Cypher
✔ Creating/Updating Graphs
✔ Pattern Matching
✔ Aggregations
✔ Creating nodes/relationships
✔ Loading data from files
• Know where to go next
Breaks/Lunch
• 2 Breaks (15 minutes)
✔ 10:30ish
✔ 2:30ish
• Lunch
✔ 1 hour 12:00-1:00ish
Prep Items
To Do if Not Done Already:
Install Neo4j Desktop from neo4j.com/download
Sign up for Neo4j Aura at https://neo4j.com/cloud/aura/
- If the local Desktop install or Aura setup is problematic -
• Create Sandbox on neo4j.com/sandbox
Helpful Links:
Neo4j Developer Materials: https://neo4j.com/developer/
Cypher RefCard: https://neo4j.com/docs/cypher-refcard/current/
Introduction to
Graph Theory
What’s the point of graphs?
A graph lets us model the real world to answer tough questions about how
things are connected, especially in ways that may not be obvious!
Seven Bridges of Königsberg problem. Leonhard Euler, 1735
What is Graph Theory?
In mathematics, Graph Theory is the study of graphs, which are
mathematical structures used to model relationships between concepts.
More intuitively: Graph Theory is the study of relationships.
What is a Graph?
G = (V, E)
V: a set of vertices
E: a set of edges,
where an edge is a pair of vertices
V = {1,2,3,4,5,6}
E = { {1,2},{2,3},{2,6},{3,5},{3,4},{5,4},{6,5} }
[Figure: the graph G drawn with vertices 1 through 6]
What is Traversal?
Traversal is the process of following a sequence of edges that link adjacent vertices.
Example: w = ( {1,2},{2,3} )
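The definitions above can be tried out in a few lines of Python. This is only a sketch: the function name is_walk is an assumption made for illustration, and each undirected edge is stored as a frozenset so that {1,2} and {2,1} are the same edge.

```python
# Vertices and edges from the slide; a Python set keeps each edge only once.
V = {1, 2, 3, 4, 5, 6}
E = {frozenset(pair) for pair in
     [(1, 2), (2, 3), (2, 6), (3, 5), (3, 4), (5, 4), (6, 5)]}


def is_walk(edge_sequence):
    """Return True if the edges can be followed one after another in G."""
    edges = [frozenset(pair) for pair in edge_sequence]
    if not edges or any(edge not in E for edge in edges):
        return False
    # Vertices we could be standing on after crossing the edges so far.
    positions = set(edges[0])
    for edge in edges[1:]:
        positions = {v for u in positions if u in edge for v in edge - {u}}
        if not positions:
            return False
    return True


w = [(1, 2), (2, 3)]
print(is_walk(w))                 # the traversal from the slide
print(is_walk([(1, 2), (3, 4)]))  # both edges exist, but they are not adjacent
```

Tracking the current vertex, rather than only checking that neighbouring edges share a vertex, is what rejects a sequence such as {1,2},{2,3},{2,6}.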
Graphs Are Everywhere!
• The Internet
• Chemistry (for example, the H-O-H bonds of a water molecule)
• Active Directory & LDAP
• Public Transit & Supply Chains
• Social Networks
Why Graph?
Harnessing Connections Drives Business Value
Enhanced Decision
Making
Hyper
Personalization
Massive Data
Integration
Data Driven
Discovery & Innovation
Product Recommendations
Personalized Health Care
Media and Advertising
Fraud Prevention
Network Analysis
Law Enforcement
Drug Discovery
Intelligence and Crime Detection
Product & Process Innovation
360 view of customer
Compliance
Optimize Operations
Data Science
AI & ML
Fraud Prediction
Patient Journey
Customer Disambiguation
Transforming Industries
Modern Graph Theory Applications
• Real-Time Recommendations
• Fraud Detection
• Network & IT Operations
• Master Data Management
• Knowledge Graph
• Identity & Access Management
https://neo4j.com/use-cases/
https://neo4j.com/sandbox/
https://neo4j.com/graphgists/
Data connections have become the foundation of business technology and have created industry leaders
Relationships in RDBMS
● Require foreign keys, and possibly a lookup table
● Traversing a foreign key requires an index lookup
The purpose of graphs is to do rapid traversal. The RDBMS model is too
expensive for that.
Person
ID Name
1 Anne
2 James
3 Alex
Address
ID Country
1 Germany
2 USA
Lookup
Person Address
1 2
2 2
3 1
Joins are executed every time
you query the relationship
Executing a Join means to
search for a key
B-Tree Index: O(log(n))
When your data grows by 10x, each index lookup takes roughly one more step, and that extra step is paid on every Join
More Data = More Searches
Slower Performance
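The arithmetic behind the O(log(n)) claim can be sketched as follows. The function names and the fanout of 10 keys per index page are assumptions chosen so the numbers line up with the "10x" wording; real B-tree fanouts are larger, which changes the constants but not the shape of the growth.

```python
def index_levels(n_rows, fanout=10):
    """Count the tree levels needed to index n_rows keys.

    fanout is a hypothetical number of keys per index page.
    """
    levels, capacity = 0, 1
    while capacity < n_rows:
        capacity *= fanout
        levels += 1
    return levels


def join_chain_cost(n_rows, n_joins, fanout=10):
    """Index steps for one result path that crosses n_joins joins."""
    return n_joins * index_levels(n_rows, fanout)


for n_rows in (1_000, 10_000, 100_000):
    print(n_rows, index_levels(n_rows), join_chain_cost(n_rows, n_joins=3))
```

Each tenfold increase adds one level per lookup, and a query with three joins pays that increase three times.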
The Problem
Relational Databases can’t handle Relationships
• Degraded Performance: Speed plummets as data grows and as the number of joins grows
• Wrong Language: SQL was built with Set Theory in mind, not Graph Theory
• Not Flexible: New types of data and relationships require schema redesign
• Wrong Model: They cannot model or store relationships without complexity
Relationships in RDBMS vs Graph
Find all direct reports and how many people they manage, up to 3 levels down
Graph DB Query (using Cypher Query Language):
MATCH
(sub)-[:REPORTS_TO*0..3]->(boss),
(report)-[:REPORTS_TO*1..3]->(sub)
WHERE
boss.name = 'John Doe'
RETURN
sub.name AS Subordinate,
count(report) AS Total
[Figure: the equivalent SQL Query, shown on the slide for comparison]
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
NoSQL Databases can’t handle Relationships
• Degraded Performance: Speed plummets as you try to join data together in the application
• Wrong Languages: Lots of odd “almost SQL” languages that are terrible at “joins”
• Not ACID: No support for transactions
• Wrong Model: They cannot model or store relationships without complexity
When To Use Graph?
Graph Databases: Designed for Connected Data
RELATIONAL DATABASES
Store and retrieve data
NoSQL DATABASES
Aggregate and filter data
Connections in data
Real time storage & retrieval
Real-Time Connected Insights
Long running queries
aggregation & filtering
“Our Neo4j solution is literally thousands of times faster than the
prior MySQL solution, with queries that require 10-100 times less code”
Volker Pacher, Senior Developer
From Disparate Silos
To Cross-Silo Connections
What is Neo4j?
In This Module You’ll Learn ...
At the end of this module, you should be able to:
● Describe the components and benefits of the Neo4j Graph Platform.
Connections in Data are as Valuable as the Data Itself
Networks of People
E.g., Employees, Customers, Suppliers, Partners, Influencers
[Figure: people connected by Knows relationships]
Transaction Networks
E.g., Risk management, Supply chain, Payments
[Figure: relationships labeled Bought, Viewed, Returned]
Knowledge Networks
E.g., Enterprise content, Domain specific content, eCommerce content
[Figure: relationships labeled Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for]
Neo4j - The Graph Company
750+
7 of 10
20 of 25
7 of 10
53K+
100+
300+
450+
Adoption
Top Retail Firms
Top Financial Firms
Top Software Vendors
Customers Partners
• Founders wrote the book on Graph
• Also wrote the book on Graph Algorithms
• Creator of the Neo4j Graph Platform
• ~350 employees
• HQ in Silicon Valley, other offices include Boston, London, Munich, Paris and Malmö
• Market: Neo4j is the clear leader. More customers and usage than all other Graph products combined (DB-Engines)
Ecosystem
SMB building products
based on Neo4j
Enterprise customers
Partners
Meetup members
Events per year
Industry’s Largest Dedicated Investment in Graphs
8 of 10 Top Insurance Providers
Harnessing Connections Drives Business Value
Enhanced Decision
Making
Hyper
Personalization
Massive Data
Integration
Data Driven Discovery
& Innovation
Product Recommendations
Personalized Health Care
Media and Advertising
Fraud Prevention
Network Analysis
Law Enforcement
Drug Discovery
Intelligence and Crime Detection
Product & Process Innovation
360 view of customer,
vendor, product, etc.
Compliance
Optimize Operations
Connected Data at the Center
AI & Machine
Learning
Price optimization
Product Recommendations
Resource allocation
Digital Transformation Megatrends
Neo4j – Re-Imagine Your Data as a Graph
Neo4j is an enterprise-grade graph database
that enables you to:
• Model and store your data as a graph
• Query data relationships with ease and in real time
• Seamlessly evolve applications to support new requirements by adding new kinds of data and relationships
● Agile development
● High performance
● Vertical and horizontal scale
● Seamless evolution
Designed for Enterprise-Grade Workloads
• Scalability: Find insights and connections across billions of nodes
• Security: Store and apply granular access control to the most sensitive data
• Flexibility: Expand your graph database to multiple use cases
Native Storage and Processing
Index Free Adjacency
Neo4j disk and memory structures link data directly, allowing millions of graph traversals per second per core.
Graph data and paths between data do not have to be pre-defined before they can be used.
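A rough way to picture index-free adjacency is a node that stores direct references to its neighbours, so each hop follows a reference instead of searching an index. The sketch below only illustrates that idea; the class and function names are assumptions, and it does not reflect Neo4j's actual storage format.

```python
class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references, no lookup table involved

    def connect(self, other):
        self.outgoing.append(other)


def reachable_names(start, max_hops):
    """Follow stored references outward for up to max_hops steps."""
    seen = {start.name}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbour in node.outgoing:  # one reference hop per neighbour
                if neighbour.name not in seen:
                    seen.add(neighbour.name)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return seen - {start.name}


a, b, c, d = (Node(n) for n in "abcd")
a.connect(b)
b.connect(c)
c.connect(d)
print(sorted(reachable_names(a, max_hops=2)))
```

The work done per hop depends on how many neighbours a node has, not on how many nodes exist in total.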
Neo4j Database ACID Transactions
Transactional consistency: all updates in a transaction either succeed together or fail together.
● Atomicity
● Consistency
● Isolation
● Durability
ACID Consistency
Non-ACID
Graph DBMs (NoSQL)
Property Graph - Simply Powerful
Nodes represent objects (nouns)
Nodes can have properties (name/value pairs)
Relationships connect nodes and represent actions (verbs)
Relationships are directional
Relationships can have properties (name/value pairs)
[Figure: Employee, Company, and City nodes. Node properties shown: name: Amy Peters, date_of_birth: 1984-03-01, employee_ID: 1. Relationship types shown: :HAS_CEO (with start_date: 2008-01-20) and :LOCATED_IN]
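The property graph elements on this slide map naturally onto two small record types. The following is a minimal sketch, not Neo4j's data model as implemented; the class names are illustrative, and the direction and endpoints of the two relationships are assumptions, since the slide text does not state them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


amy = Node({"Employee"}, {"name": "Amy Peters",
                          "date_of_birth": "1984-03-01",
                          "employee_ID": 1})
company = Node({"Company"})
city = Node({"City"})

# Direction and endpoints below are assumptions made for illustration.
has_ceo = Relationship("HAS_CEO", company, amy, {"start_date": "2008-01-20"})
located_in = Relationship("LOCATED_IN", company, city)

print(has_ceo.start.labels, has_ceo.type, has_ceo.end.properties["name"])
```

Both nodes and relationships carry a properties dictionary, and every relationship has exactly one type, one start node, and one end node.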
Modeling relational to graph
In some ways they’re similar:
Relational      Graph
Rows            Nodes
Joins           Relationships
Table names     Labels
Columns         Properties
In some ways they’re not:
• Relational: Each column must have a field value.
  Graph: Nodes with the same label aren't required to have the same set of properties.
• Relational: Joins are calculated at query time.
  Graph: Relationships are stored on disk when they are created.
• Relational: A row can belong to one table.
  Graph: A node can have many labels.
How we model: RDBMS vs graph
• Relational: Try and get the schema defined and then make minimal changes to it after that.
  Graph: It's common for the schema to evolve with the application.
• Relational: More abstract focus when modeling, i.e. focus on classes rather than objects.
  Graph: Common to use actual data items when modeling.
RDBMS vs graph models
players: id, name, position
clubs: id, name, country
transfers: id, fee, player_age, player_id, from_club_id, to_club_id, season
RDBMS Vocabulary Mapped to Graph Modeling
Relational DB Construct            Graph DB Construct
Entity table                       Node labels
Row                                Node
Columns                            Node properties
Technical primary keys             Replace with business primary keys
Constraints                        Unique constraints for business keys
Indexes                            Indexes on any property
Foreign keys                       Relationships
Default values                     Not required
De-normalized or duplicated data   Create separate nodes
Join tables                        Relationships
Join table columns                 Relationship properties
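To make the mapping concrete, here is a small sketch that applies it to the Person, Address, and Lookup tables shown earlier in the deck. The dictionary layout and the relationship type LIVES_IN are assumptions chosen for illustration; they are not part of the original tables.

```python
# Rows from the Person, Address, and Lookup tables shown earlier.
person_rows = [(1, "Anne"), (2, "James"), (3, "Alex")]
address_rows = [(1, "Germany"), (2, "USA")]
lookup_rows = [(1, 2), (2, 2), (3, 1)]  # (person ID, address ID)

# Each row becomes a node; the table name becomes the label.
people = {pid: {"label": "Person", "name": name, "out": []}
          for pid, name in person_rows}
addresses = {aid: {"label": "Address", "country": country}
             for aid, country in address_rows}

# Each join-table row becomes a relationship stored on the start node.
# LIVES_IN is a hypothetical relationship type chosen for this sketch.
for pid, aid in lookup_rows:
    people[pid]["out"].append(("LIVES_IN", addresses[aid]))

for person in people.values():
    for rel_type, address in person["out"]:
        print(person["name"], rel_type, address["country"])
```

After the conversion the Lookup table is gone: the connection between a person and an address is stored once, when it is created, rather than recomputed by a join at query time.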
Property Graph
Components
Neo4j’s Property Graph
• Nodes
• Relationships
• Labels
• Properties
Neo4j’s Property Graph
• Node = Vertex
• Relationship = Edge
Neo4j’s Property Graph
Nodes
• Represent objects or entities
• Can be labeled
• May have properties
Relationships
• Must have a type
• Must have a direction
• May have properties
• Nodes can share multiple relationships
[Figure: Person and Car nodes connected by relationships.
Person: name: “Dan”, born: May 29, 1970, twitter: “@dan”
Person: name: “Ann”, born: Dec 5, 1975
Car: brand: “Volvo”, model: “V70”, year: 2010
Relationship types shown: LOVES, LIVES WITH, DRIVES, OWNS. One relationship carries the property since: 2018-10-1]
Neo4j Graph Platform
The Neo4j Graph Platform includes components that enable you to develop your
graph-enabled application. To better understand the Neo4j Graph Platform, you
will learn about these components and the benefits they provide.
The heart of the Neo4j Graph Platform is the Neo4j Database.
Neo4j DBMS: Clusters
Neo4j cluster support
• ACID across all locations
• Available in
Neo4j Enterprise Edition
Clusters provide:
• High availability
• Scalability
• For read access to data
• Failover
• A vital requirement for
many enterprises
Develop Applications Faster and Easier
Official Language Drivers
• Foundational drivers for popular programming languages
• Bolt: streaming binary wire protocol
• Authoritative mapping to native type system, uniform across drivers
• Pluggable into richer frameworks
[Figure: drivers for JavaScript, Java, .NET, Python, and Go, plus Community Drivers, all connecting over Bolt]
Neo4j Advantage – Developer productivity
Libraries
Out-of-the-box:
• Awesome Procedures on
Cypher (APOC)
• Graph Data Science
• GraphQL
Neo4j community has
contributed many specialized
libraries also.
Tools
• Neo4j Desktop *
• Neo4j Browser *
• Neo4j Cypher-shell
• Neo4j Bloom
• Neo4j ETL Tool
• Neo4j Graph Algorithms
• Neo4j BI-Connector
Neo4j community has
contributed many
specialized tools also.
Neo4j Desktop: UI for developers & DB management
Supports “plugins”
• Neo4j official plugins
• Neo4j labs plugins
• 3rd party plugins
• Bloom plugin for use with local databases managed by Desktop only
Allows you to manage local databases
• Create, stop, start, manage
• Add apoc procs, etc.
• See log files, configuration, etc.
Allows you to connect to remote
databases
• You can’t manage – but you can open browser
Supports organization via “projects”
Neo4j Browser
In reality
• Lightweight web/JavaScript application
Purpose
• Cypher coding
• Quick/small visualizations
• Exporting result sets
Limitations – only one at a time
Available via your favorite web browser
• Same bolt protocol & UI
• Easy way to bypass the above limitation
https://www.youtube.com/watch?v=oHo-lQ79zf0&feature=youtu.be
Neo4j Bloom User Interface
Search with type-ahead
suggestions
Category icons and color
scheme
Visualize, Explore and Discover
Pan, Zoom and Select
Property Browser and editor
Neo4j Cypher-shell
Graph Algorithms in Neo4j
neo4j.com/graph-algorithms-book/
• Pathfinding & Search: Finds optimal paths or evaluates route availability and quality
• Centrality / Importance: Determines the importance of distinct nodes in the network
• Community Detection: Detects group clustering or partition options
• Similarity: Evaluates how alike nodes are
• Link Prediction: Estimates the likelihood of nodes forming a future relationship
Visually Recognizing Patterns Believe it or not…
…the starting node was
not the one in the center
…the “bridging” entity
resolution nodes between
clusters were unexpected
Sometimes it is simple to see…. (1..3 hops)
MATCH p=(ah1:BusinessCustomer:AccountHolder)-[:MAKES_PAYMENTS_TO*1..3]->(ah2:BusinessCustomer:AccountHolder)
WHERE ah1.accountName="Lang and Sons"
AND ah2.accountName="Klein, Johnston and Glover"
RETURN p LIMIT 500
…other times it is chaos….(1..4 hops)
MATCH p=(ah1:BusinessCustomer:AccountHolder)-[:MAKES_PAYMENTS_TO*1..4]->(ah2:BusinessCustomer:AccountHolder)
WHERE ah1.accountName="Lang and Sons"
AND ah2.accountName="Klein, Johnston and Glover"
RETURN p LIMIT 500
It’s a matter of scale…..
The point?
Visualization tools can help….
• …but with any volume, attempting to recognize patterns visually is quickly overwhelming
This is where graph algorithms come in
• Entity resolution → disambiguation → similarities, link prediction
• Fraud networks → community detection, centrality
• Payment chaining → community detection, centrality, pathfinding/search
Use the results to re-visualize
• Set node size/colors based on graph algo weights/scores
• Community Detection: Detects group clustering or partition options.
• Centrality / Importance: Determines the importance of distinct nodes.
• Similarity: Measures node similarity based on neighbors and relationships.
• Pathfinding & Search: Finds optimal paths or route availability and quality.
• Link Prediction: Estimates the likelihood of nodes forming a future relationship.
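Two of these categories are easy to illustrate on a toy graph. The sketch below uses plain Python rather than the Neo4j Graph Data Science library; the payment graph, the function names, and the choice of breadth-first search and in-degree as stand-ins for pathfinding and centrality are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical payment graph: account -> accounts it pays.
payments = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbour in graph[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


def in_degree(graph):
    """Count incoming relationships per node, a simple centrality score."""
    scores = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            scores[target] += 1
    return scores


print(shortest_path(payments, "A", "E"))
print(in_degree(payments))
```

Scores like these are the kind of result that can be written back to the graph and used to size or colour nodes when re-visualizing.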
Kettle - An ETL Platform that Speaks Neo4j
● Download at: https://kettle.be
○ Make sure to install Java 8
● Cross-platform drag-and-drop ETL Workbench GUI
○ No coding required!
● Includes server components for scheduling and running complex jobs
○ Scales from local desktop use to production server cluster
Neo4j BI Connector
The most popular BI tools can now talk live to the world’s most popular graph db
• Best live, seamless integration of graph data with your favorite BI tools
• Familiar UI for end users
• No development effort for IT
• Democratizing access to Neo4j data
• Free to adopt by BI teams of enterprise edition customers
[Figure: Business/Data Analyst, Investigator, and Data Scientist use a BI tool such as Tableau, which sends SQL over JDBC to the Neo4j BI Connector; the connector issues Cypher to Neo4j]
Graph Platform Surrounds Neo4j Enterprise
What Else is Online for help with Neo4j
Check your understanding
Question 1
What are some of the benefits provided by the Neo4j Graph Platform?
Select the correct answers.
❏ Database clustering
❏ ACID
❏ Index free adjacency
❏ Optimized graph engine
Answer 1
What are some of the benefits provided by the Neo4j Graph Platform?
Select the correct answers.
✅ Database clustering
✅ ACID
✅ Index free adjacency
✅ Optimized graph engine
Question 2
What libraries are included with Neo4j Graph Platform?
Select the correct answers.
❏ APOC
❏ JGraph
❏ Graph Data Science
❏ GraphQL
Answer 2
What libraries are included with Neo4j Graph Platform?
Select the correct answers.
✅ APOC
❏ JGraph
✅ Graph Data Science
✅ GraphQL
Question 3
What are some of the language drivers that come with Neo4j out of the box?
Select the correct answers.
❏ Java
❏ Ruby
❏ Python
❏ JavaScript
Answer 3
What are some of the language drivers that come with Neo4j out of the box?
Select the correct answers.
✅ Java
❏ Ruby
✅ Python
✅ JavaScript
Summary
You should be able to:
● Describe the components and benefits of the Neo4j Graph Platform.
Getting around in Neo4j
Desktop & Browser
Note: Much of this was covered in the videos included with the instructions sent before class, so we are going to cover it quite quickly.
Overview
At the end of this module, you should be able to:
● Start using Neo4j Desktop / Neo4j Sandbox
● Start using Neo4j Browser
Neo4j Desktop
• Full featured Neo4j Enterprise
Edition
• Single user license
• Runs on your laptop or
desktop computer
• 4-core max
• Includes Browser
• Includes Free Bloom
Visualization License
Neo4j Sandbox
• Web browser access to Neo4j
Database Server and Neo4j
Database in the cloud
• Comes with a blank or
pre-populated database
• Temporary access - Instance lives
for up to ten days
• No need to install Neo4j on your
machine
https://neo4j.com/cloud/aura/
Neo4j Aura
• Database as a Service
• Various configurations to
choose from.
• Scale up or down
• Pay only for the amount of
time you use it.
• Runs in the cloud. No need to
install Neo4j on your machine.
• Includes Bloom Visualization
tool
Neo4j Desktop (1.4.1)
Project Folders
Create Graph
(database)
Start Browser
(same as http://localhost:7474)
Add Plugin
• Graph Algorithms
• APOC’s (procedures)
• GraphQL
License keys
• Select “Add software key”
• Copy/paste link
• Only manages license keys for local database instances
managed by Neo4j desktop
• Accessed through the Desktop or Web Browser (localhost:7474)
Neo4j Browser 101
$ Enter Queries / Commands Here
Desktop
Web Browser
Start the browser
Enter Queries
Display Options
• Change node colors
• Change which node property is displayed
• Double-click a node and see what happens!
Query Editing
• Use :clear to clear past results
• Scroll through past queries with (CMD) ⌘ + Arrow / (CTRL) ^ + Arrow
• Other useful commands :history :clear :help
• Run queries with (CMD) ⌘ + Enter / (CTRL) ^ + Enter
• Insert new line with SHIFT + Enter
• Expand the query bar with ESC
Neo4j Browser 101
https://neo4j.com/developer/guide-neo4j-browser/
Where to get syntax help in Browser
Set browser for multi-statement
Click settings (gear) in lower left pane
Select “Enable multi-statement query editor”
• Many customers keep constraints, etc. in scripts with ;’s –
this option allows you to execute the scripts without error
• One note of caution is that if you actually issue
multiple-statements, you can only see the completion state
– not the results as normal.
Note: “Connect Result Nodes”
• This is extremely useful – BUT – it comes at a cost in that
after a query executes, desktop issues a plethora of queries
to find all the connections between the nodes even when
not mentioned in the query.
✔ Between this and rendering the graph with auto-layout, this
is the reason it seems that Neo4j Browser can take 3 minutes
to return a query that supposedly runs in 50ms
• For long running complex statements in which the result
returns the desired connections and you don’t care to see
others, de-select this option.
• Keep selected for this class
Neo4j Desktop
• Create local databases
• Manage multiple projects
• Manage Database Server
• Start Neo4j Browser
instances
• Install plugins (libraries) for
use with a project
• OS X, Linux, Windows
Introduction to Cypher
In This Module You’ll Learn ...
How to write Cypher statements to ...
● Retrieve nodes from the graph
● Filter nodes retrieved using labels and node property values
● Retrieve node property values
● Filter retrieved nodes using relationships
Additional information is available from these sources:
● Neo4j Cypher Manual (https://neo4j.com/docs/)
● Cypher Reference card (https://neo4j.com/docs/cypher-refcard/current/)
A pattern matching query
language made for graphs
• Declarative & Expressive
(what to find, not how to find it)
• Pattern Matching
The Cypher Query Language
(:Person {name:'Dan'})-[:LOVES]->(:Person {name:'Ann'})
-[:OWNS]->(:Car {brand:'Volvo'})
(anything)-[:DRIVES]->(something)
Express Graph Patterns with ASCII ART ¯\_(ツ)_/¯
Cypher Query Language – the MATCH clause
MATCH – Return Data
MATCH (n:Person {name:"Dan"})-[r:LOVES]->(x)
RETURN n, r, x
• NODE: ( )
• LABEL: :Person
• PROPERTY: { name:"Dan" }
• RELATIONSHIP: -[ :LOVES ]->
• Variables: n, r, x
[Figure: Dan and Ann nodes joined by a LOVES relationship, with a Car node alongside]
Node Syntax
()
(p)
Label Syntax
(:Person)
(p:Person)
(:Location)
(l:Location)
(x:Residence)
(x:Location:Residence)
Comments in Cypher
// anonymous node, cannot be referenced later in the query
()
(p) // variable p, a reference to a node used later
(:Person) // anonymous node with the label Person
(p:Person) // p, a reference to a node with the label Person
// p, a reference to a node with the labels Actor and Director
(p:Actor:Director)
MATCH and RETURN
Viewing the Data Model
CALL db.schema.visualization
MATCH and RETURN
Syntax examples for a query:
MATCH (variable)
RETURN variable
MATCH (variable:Label)
RETURN variable
Retrieve all nodes:
MATCH (n) // returns all nodes in the graph
RETURN n
Retrieve all Person Nodes
MATCH (p:Person) // returns all Person nodes in the graph
RETURN p
Viewing Nodes as Table Data
Exercise 1: Retrieving Nodes
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 1.
Note: This exercise has 4 steps. Estimated time to complete: 10 minutes
Properties
Filtering Query by
Year Born
MATCH (p:Person {born: 1970})
RETURN p
Property filter
Filtering Query by Multiple
Properties
MATCH (m:Movie {released: 2003, tagline: 'Free your mind'})
RETURN m
Multiple property filters
Returning Property Values
MATCH (p:Person {born: 1965})
RETURN p.name, p.born
Property values
Specifying Aliases for Column Headings
MATCH (p:Person {born: 1965})
RETURN p.name AS name, p.born AS `birth year`
Column headings
Exercise 2: Filtering Queries
Using Property Values
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 2.
Note: This exercise has 6 steps. Estimated time to complete: 15 minutes
Relationships
● Directed connection between two nodes
● Relationships have a type (name)
● Relationships can have properties, just like
nodes
● Relationships are key to traversing a graph
Anonymous nodes/relationships
Named node/relationship
• Node label or relationship type is specified
Anonymous node/relationship
• Node/relationships are not specified – “empty” placeholders in cypher
() // a node...any node
()--() // 2 nodes have some type of relationship (any direction)
()-->() // the first node has a relationship to the second node
()-[]->() // same as above
()<--() // the second node has a relationship to the first node
()<-[]-() // ditto
Querying using relationships
Person Person
Location
Residence
MARRIED
LIVES_AT
LIVES_AT
OWNS
MATCH (p:Person)-[:LIVES_AT]->(h:Residence)
RETURN p.name, h.address
MATCH (p:Person)--(h:Residence) // any relationship
RETURN p.name, h.address
When using a “named” (typed) relationship, Neo4j can quickly traverse only those relationships and test whether the node at the other end has the correct label.
When using an “anonymous” relationship, Neo4j has to traverse every relationship and then inspect every node to see whether it has the desired label. This may be slower, but it increases flexibility.
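The difference between the two queries can be mimicked with a small in-memory list of relationships. This is only a sketch of the matching logic, not how Neo4j executes a query; the node names, the address, and the match function are hypothetical.

```python
# Hypothetical data: (start node, relationship type, end node).
nodes = {
    "ann": {"labels": {"Person"}, "name": "Ann"},
    "dan": {"labels": {"Person"}, "name": "Dan"},
    "home": {"labels": {"Location", "Residence"}, "address": "1 Main St"},
}
relationships = [
    ("ann", "MARRIED", "dan"),
    ("ann", "LIVES_AT", "home"),
    ("dan", "LIVES_AT", "home"),
    ("dan", "OWNS", "home"),
]


def match(rel_type=None):
    """Return (name, address) rows for Person-to-Residence relationships.

    rel_type=None plays the role of an anonymous relationship: every
    relationship is examined, in either direction.
    """
    rows = []
    for start, kind, end in relationships:
        if rel_type is not None and kind != rel_type:
            continue  # typed match: other relationship types are skipped early
        directions = [(start, end)] if rel_type else [(start, end), (end, start)]
        for a, b in directions:
            if "Person" in nodes[a]["labels"] and "Residence" in nodes[b]["labels"]:
                rows.append((nodes[a]["name"], nodes[b]["address"]))
    return rows


print(match("LIVES_AT"))  # like (p:Person)-[:LIVES_AT]->(h:Residence)
print(match())            # like (p:Person)--(h:Residence)
```

The anonymous form returns one row per matching relationship, so Dan appears twice: once for LIVES_AT and once for OWNS.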
Using a Relationship in a Query
Find all people who acted in the
movie ‘The Matrix’, and return the
nodes and relationships
MATCH (p:Person)-[rel:ACTED_IN]->(m:Movie {title: 'The Matrix'})
RETURN p, rel, m
Relationship
Querying Using Multiple
Relationships
Find all movies that Tom Hanks acted in
or directed and return the titles of the
movies
MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN|DIRECTED]->(m:Movie)
RETURN p.name, m.title
Multiple Relationships
No node variable specified here
Using Anonymous Nodes in a Query
MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'})
RETURN p.name
Find all people who acted in the movie
‘The Matrix’ and return their names
Using an Anonymous Relationship for a Query
MATCH (p:Person)-->(m:Movie {title: 'The Matrix'})
RETURN p, m
Find all people who have any type of relationship to the movie ‘The Matrix’, and return the Person and Movie nodes
Anonymous Relationship
More Anonymous Relationships
It is recommended that empty brackets [ ] not be used
MATCH (p:Person)--(m:Movie {title: 'The Matrix'})
RETURN p, m
MATCH (m:Movie)<--(p:Person {name: 'Keanu Reeves'})
RETURN p, m
MATCH (p:Person)-[]-(m:Movie {title: 'The Matrix'})
RETURN p, m
Retrieving the Relationship Types
There is a built-in function,
type() that returns the
type of a relationship
MATCH (p:Person)-[rel]->(:Movie {title:'The Matrix'})
RETURN p.name, type(rel)
type() function
Properties for Relationships
Filtering Using Relationship Properties
Find all people that gave the movie ‘The Da Vinci Code’ a rating of 65
and return their names.
MATCH (p:Person)-[:REVIEWED {rating: 65}]->(:Movie {title: 'The Da Vinci Code'})
RETURN p.name
Property filter
Traversing a Graph
Patterns in the Graph
Using Patterns for Queries
MATCH (p:Person)-[:FOLLOWS]->(:Person {name:'Angela Scope'})
RETURN p
Looking for people that follow Angela
Reversing the Traversal
MATCH (p:Person)<-[:FOLLOWS]-(:Person {name:'Angela Scope'})
RETURN p
Looking for people that Angela follows
Querying a Relationship in Both Directions
MATCH (p1:Person)-[:FOLLOWS]-(p2:Person {name:'Angela Scope'})
RETURN p1, p2
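The three FOLLOWS patterns differ only in which end of the relationship is fixed. A small sketch with made-up data shows the effect; the follower names and the function names are hypothetical, and only 'Angela Scope' comes from the slides.

```python
# Hypothetical FOLLOWS relationships: (follower, followed).
follows = [
    ("Bob", "Angela Scope"),
    ("Cleo", "Angela Scope"),
    ("Angela Scope", "Dina"),
]


def followers_of(name):
    """(p)-[:FOLLOWS]->(name): people who follow name."""
    return sorted(a for a, b in follows if b == name)


def followed_by(name):
    """(p)<-[:FOLLOWS]-(name): people that name follows."""
    return sorted(b for a, b in follows if a == name)


def connected_to(name):
    """(p)-[:FOLLOWS]-(name): either direction.

    Cypher would return one row per relationship; this sketch
    de-duplicates the names instead.
    """
    return sorted(set(followers_of(name)) | set(followed_by(name)))


print(followers_of("Angela Scope"))
print(followed_by("Angela Scope"))
print(connected_to("Angela Scope"))
```

Omitting the arrowhead widens the match to both directions, which is convenient but gives the database more relationships to examine.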
Traversing Multiple Relationships
Query to return all followers of the followers of Jessica Thompson
MATCH (p:Person)-[:FOLLOWS]->(:Person)-[:FOLLOWS]->
(:Person {name:'Jessica Thompson'})
RETURN p
Using Patterns to Focus the Query
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name, m.title, d.name
Returning Paths
MATCH path = (:Person)-[:FOLLOWS]->(:Person)-[:FOLLOWS]->(:Person {name:'Jessica Thompson'})
RETURN path
Path assigned to variable path
Returning Multiple Paths
MATCH path = (:Person)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(:Person {name:'Ron Howard'})
RETURN path
Best practice
● Specify direction in MATCH
statements
● It optimizes queries,
especially for larger graphs
Cypher Style Recommendations (1 of 2)
Here are the Neo4j-recommended Cypher coding standards:
● Node labels are PascalCase and case-sensitive (examples: Person, NetworkAddress).
● Property keys, variables, parameters, aliases, and functions are camelCase and case-sensitive (examples: businessAddress, title).
● Relationship types are upper-case and can use the underscore (examples: ACTED_IN, FOLLOWS).
● Cypher keywords are upper-case (examples: MATCH, RETURN).
Cypher Style Recommendations (2 of 2)
Here are more Neo4j-recommended Cypher coding standards:
● String constants are in single quotes.
● Specify variables only when needed for use later in the Cypher statement.
● Place named nodes and relationships (that use variables) before anonymous nodes and relationships in your MATCH clauses when possible.
● Specify anonymous relationships with -->, --, or <--
Exercise 3: Filtering Queries
Using Relationships
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 3.
Note: This exercise has 6 steps. Estimated time to complete: 15 minutes
Check Your
Understanding
Question 1
Suppose you have a graph that contains nodes representing customers and other business
entities for your application. The node label in the database for a customer is Customer. Each
Customer node has a property named email that contains the customer’s email address.
What Cypher query do you execute to return the email addresses for all customers in the
graph?
Select the correct answer:
❏ MATCH (n) RETURN n.Customer.email
❏ MATCH (c:Customer) RETURN c.email
❏ MATCH (Customer) RETURN email
❏ MATCH (c) RETURN Customer.email
Answer 1
Suppose you have a graph that contains nodes representing customers and other business
entities for your application. The node label in the database for a customer is Customer. Each
Customer node has a property named email that contains the customer’s email address.
What Cypher query do you execute to return the email addresses for all customers in the
graph?
Select the correct answer:
❏ MATCH (n) RETURN n.Customer.email
✅ MATCH (c:Customer) RETURN c.email
❏ MATCH (Customer) RETURN email
❏ MATCH (c) RETURN Customer.email
Question 2
Suppose you have a graph that contains Customer and Product nodes. A Customer node can have a
BOUGHT relationship with a Product node. Customer nodes can have other relationships with
Product nodes. A Customer node has a property named customerName. A Product node has a
property named productName. What Cypher query do you execute to return all of the products (by
name) bought by customer 'ABCCO'.
Select the correct answer:
❏ MATCH (c:Customer {customerName: 'ABCCO'}) RETURN c.BOUGHT.productName
❏ MATCH (:Customer 'ABCCO')-[:BOUGHT]->(p:Product) RETURN p.productName
❏ MATCH (p:Product)<-[:BOUGHT_BY]-(:Customer 'ABCCO') RETURN p.productName
❏ MATCH (:Customer {customerName: 'ABCCO'})-[:BOUGHT]->(p:Product)
RETURN p.productName
Answer 2
Suppose you have a graph that contains Customer and Product nodes. A Customer node can have a
BOUGHT relationship with a Product node. Customer nodes can have other relationships with
Product nodes. A Customer node has a property named customerName. A Product node has a
property named productName. What Cypher query do you execute to return all of the products (by
name) bought by customer 'ABCCO'?
Select the correct answer:
❏ MATCH (c:Customer {customerName: 'ABCCO'}) RETURN c.BOUGHT.productName
❏ MATCH (:Customer 'ABCCO')-[:BOUGHT]->(p:Product) RETURN p.productName
❏ MATCH (p:Product)<-[:BOUGHT_BY]-(:Customer 'ABCCO') RETURN p.productName
✅ MATCH (:Customer {customerName: 'ABCCO'})-[:BOUGHT]->(p:Product)
RETURN p.productName
Question 3
When must you use a variable in a MATCH clause?
Select the correct answer:
❏ When you want to query the graph using a node label
❏ When you specify a property value to match the query
❏ When you want to use the node or relationship to return a value
❏ When the query involves 2 types of nodes
Summary
You should now be able to write Cypher statements to:
● Retrieve nodes from the graph
● Filter nodes retrieved using labels and node property values
● Retrieve node property values
● Filter retrieved nodes using relationships
Using WHERE to Filter Queries
In This Module You’ll Learn ...
How to write Cypher WHERE clauses for testing:
● Equality
● Multiple values
● Ranges
● Labels
● Existence of a property
● String values
● Regular expressions
● Patterns in the graph
● Inclusion in a list
Cypher WHERE
Not using WHERE:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie {released: 2008})
RETURN p, m
Using the WHERE clause:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released = 2008
RETURN p, m
Querying Multiple Values
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released = 2008 OR m.released = 2009
RETURN p, m
In a WHERE clause, each value you test must be referenced through a variable bound in the MATCH pattern (here, m)
Querying Ranges
Find all people who acted in movies released between 2003 and 2004
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released >= 2003 AND m.released <= 2004
RETURN p.name, m.title, m.released
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE 2003 <= m.released <= 2004 // chained comparison, equivalent to the query above
RETURN p.name, m.title, m.released
Supported comparison operators: =, <>, <, >, <=, >=, IS NULL, IS NOT NULL
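As a quick check of how these operators behave, the following sketch uses only literal values, so it does not depend on any data in the graph. The column aliases are illustrative names, not part of the course material.

```cypher
// Literal-only checks; no nodes or relationships are needed.
RETURN 2003 <= 2004 <= 2004 AS inRange,     // true: both comparisons hold
       2003 <= 2005 <= 2004 AS outOfRange,  // false: 2005 <= 2004 fails
       null = null          AS nullEquals,  // null: comparing with null is not true
       null IS NULL         AS nullTest     // true
```

Comparing with null using = yields null rather than true, which is why IS NULL and IS NOT NULL are listed as separate operators.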
Querying Using Labels
MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'})
RETURN p.name
MATCH (p)-[:ACTED_IN]->(m)
WHERE p:Person AND m:Movie AND m.title='The Matrix'
RETURN p.name
Simplified versions of the two queries above, using only the label Person and the variable p:
MATCH (p:Person)
RETURN p.name
MATCH (p)
WHERE p:Person
RETURN p.name
Filter on Existence of a Property
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name ='Jack Nicholson' AND exists(m.tagline)
RETURN m.title, m.tagline
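A hedged alternative: in more recent Neo4j releases the property form of exists() is deprecated, and to our knowledge it is no longer accepted in Neo4j 5. Assuming the same Movies graph, the test can be written with IS NOT NULL, which should behave the same way in 4.x and 5:

```cypher
// Same filter as above, written without exists(property).
// Assumes the Movies example graph (Person, Movie, ACTED_IN).
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = 'Jack Nicholson' AND m.tagline IS NOT NULL
RETURN m.title, m.tagline
```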
Querying Using Strings
Find all actors whose first name is Michael
MATCH (p:Person)-[:ACTED_IN]->()
WHERE p.name STARTS WITH 'Michael'
RETURN p.name
String Comparisons
● String comparisons are case-sensitive
● Use toLower() or toUpper() to make a comparison case-insensitive
● Indexes are not used if a property value has been converted with a function
MATCH (p:Person)-[:ACTED_IN]->()
WHERE toLower(p.name) STARTS WITH 'michael'
RETURN p.name
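Besides STARTS WITH, Cypher also provides ENDS WITH and CONTAINS for string tests. The sketch below uses a literal string rather than graph data, so it can be run anywhere; the aliases are illustrative.

```cypher
// All four tests are case-sensitive.
RETURN 'Michael Caine' STARTS WITH 'Michael' AS startsWith,  // true
       'Michael Caine' ENDS WITH 'Caine'     AS endsWith,    // true
       'Michael Caine' CONTAINS 'el Ca'      AS contains,    // true
       'Michael Caine' STARTS WITH 'michael' AS wrongCase    // false: case differs
```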
Querying with Regular Expressions
● Indexes are never used for regular expressions
● The property value must fully match the regular expression
MATCH (p:Person)
WHERE p.name =~'Tom.*'
RETURN p.name
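The full-match rule is easy to verify with literals. This is a small sketch, independent of the graph data; it also shows the (?i) flag, which Cypher inherits from Java regular expressions, for a case-insensitive match.

```cypher
RETURN 'Tom Hanks' =~ 'Tom.*'     AS fullMatch,       // true
       'Tom Hanks' =~ 'Tom'       AS prefixOnly,      // false: the whole value must match
       'tom hanks' =~ '(?i)tom.*' AS caseInsensitive  // true
```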
Graph Patterns
Patterns (1 of 3)
Return the names of people who wrote movies, along with the movie titles
MATCH (p:Person)-[:WROTE]->(m:Movie)
RETURN p.name, m.title
Patterns (2 of 3)
The query is modified to exclude people who directed that particular movie
MATCH (p:Person)-[:WROTE]->(m:Movie)
WHERE NOT exists( (p)-[:DIRECTED]->(m) )
RETURN p.name, m.title
Patterns (3 of 3)
Find Gene Hackman and the movies that he acted in (ACTED_IN) with another person who also directed (DIRECTED) the movie
MATCH (gene:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(other:Person)
WHERE gene.name= 'Gene Hackman'
AND exists( (other)-[:DIRECTED]->(m) )
RETURN gene, other, m
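The exists() pattern test used in these slides can also be written as an existential subquery. This is a sketch that assumes Neo4j 4.0 or later and the Movies example graph; it restates the "wrote but did not direct" query in that form.

```cypher
// People who wrote a movie that they did not also direct.
MATCH (p:Person)-[:WROTE]->(m:Movie)
WHERE NOT EXISTS {
  MATCH (p)-[:DIRECTED]->(m)
}
RETURN p.name, m.title
```

The subquery form is worth knowing because it is the style that newer Neo4j versions favor for pattern existence tests.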
List Values
Retrieve the names and birth years of people born in 1965 or 1970
MATCH (p:Person)
WHERE p.born IN [1965, 1970]
RETURN p.name AS name, p.born AS yearBorn
List Values in the Graph
A property value stored in the graph can itself be a list. Here, roles is a list property on the ACTED_IN relationship:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE 'Neo' IN r.roles AND m.title='The Matrix'
RETURN p.name
Later in this course, you will learn how to create lists from your queries by aggregating
data in the graph.
There are a number of Cypher syntax elements that we do not cover in this
training. For example, you can use CASE logic in the conditional tests of your
WHERE clauses. You can learn more about these syntax elements in the Neo4j Cypher
Manual and the Cypher Refcard.
Exercise 4: Filtering Queries Using the WHERE Clause
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 4.
Note: This exercise has 6 steps. Estimated time to complete: 30 minutes
Check Your Understanding
Question 1
Suppose you want to add a WHERE clause at the end of this statement to filter the results retrieved.
MATCH (p:Person)-[rel]->(m:Movie)<-[:PRODUCED]-(:Person)
Which variables can you test in the WHERE clause?
Select the correct answers.
❏ p
❏ rel
❏ m
❏ PRODUCED
Question 2
Suppose you want to retrieve all movies that have a released property value that is 2000, 2002, 2004,
2006, or 2008. Here is an incomplete Cypher example to return the title property values of all movies
released in these years. What keyword do you specify for XX?
MATCH (m:Movie)
WHERE m.released XX [2000, 2002, 2004, 2006, 2008]
RETURN m.title
Select the correct answer:
❏ CONTAINS
❏ IN
❏ IS
❏ EQUALS
Question 3
We want a query that returns the names of any people who both acted in and wrote the same
movie. What query will retrieve this data?
Select the correct answer.
❏ MATCH (p:Person) WHERE (p)-[:WROTE]-(m) AND (p)-[WROTE]-(m) RETURN p.name, m.title
❏ MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE (p)-[:WROTE]-(m) RETURN p.name, m.title
❏ MATCH (p:Person)-[:ACTED_IN | WROTE]->(m:Movie) RETURN p.name, m.title
❏ MATCH (p:Person)-[:ACTED_IN]->(m:Movie)<-[WROTE]-(p) RETURN p.name, m.title
Summary
You should now be able to write Cypher WHERE clauses to test:
● Equality
● Multiple values
● Ranges
● Labels
● Existence of a property
● String values
● Regular expressions
● Patterns in the graph
● Inclusion in a list
Working with Patterns in Queries
In This Module You’ll Learn ...
How to write Cypher statements to ...
● Specify multiple MATCH clauses
● Specify multiple MATCH patterns
● Specify varying length paths
● Return a subgraph
● Specify OPTIONAL in a query
MATCH Clauses
Traversal in a MATCH Clause
Find all of the followers of people who
reviewed the movie titled The Replacements
MATCH (follower:Person)-[:FOLLOWS]->(reviewer:Person)-[:REVIEWED]->(m:Movie)
WHERE m.title = 'The Replacements'
RETURN follower.name, reviewer.name
MATCH Patterns
Multiple Patterns in a MATCH
MATCH (a:Person)-[:ACTED_IN]->(m:Movie),
(m)<-[:DIRECTED]-(d:Person)
WHERE m.released = 2000
RETURN a.name, m.title, d.name
A Single Pattern in a MATCH
Another way to write this same query ...
MATCH
(a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
WHERE m.released = 2000
RETURN a.name, m.title, d.name
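Two Patterns Required in a MATCH
A single path cannot express this pattern, because the Movie node m is connected to three Person nodes (meg, d, and other):
MATCH (meg:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person),
(other:Person)-[:ACTED_IN]->(m)
WHERE meg.name = 'Meg Ryan'
RETURN m.title AS movie, d.name AS director, other.name AS `co-actors`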
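Two Patterns in a MATCH
Find the actors who acted in a movie with Keanu Reeves, excluding movies in which Hugo Weaving also acted: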
MATCH (keanu:Person)-[:ACTED_IN]->(movie:Movie)<-[:ACTED_IN]-(n:Person),
(hugo:Person)
WHERE keanu.name='Keanu Reeves' AND
hugo.name='Hugo Weaving'
AND NOT (hugo)-[:ACTED_IN]->(movie)
RETURN n.name
Traversal With Patterns
MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie)
MATCH (actor:Person)-[:ACTED_IN]->(m)
WHERE valKilmer.name = 'Val Kilmer'
RETURN m.title AS movie, actor.name
Traversal With Multiple Patterns
MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie),
(actor:Person)-[:ACTED_IN]->(m)
WHERE valKilmer.name = 'Val Kilmer'
RETURN m.title AS movie, actor.name
Varying Length Paths
MATCH (follower:Person)-[:FOLLOWS*2]->(p:Person)
WHERE follower.name = 'Paul Blythe'
RETURN p.name
Varying Length Patterns (1 of 2)
(nodeA)-[:RELTYPE*]->(nodeB)
Retrieve all paths of any length with the relationship :RELTYPE from nodeA to nodeB and beyond
(nodeA)-[:RELTYPE*]-(nodeB)
Direction removed: retrieve all paths of any length with the relationship :RELTYPE from nodeA to nodeB, or from nodeB to nodeA, and beyond
Usually this is a very expensive query, so limit the retrieved nodes
Varying Length Patterns (2 of 2)
(node1)-[:RELTYPE*3]->(node2)
Retrieve the paths of length 3 with the relationship :RELTYPE from node1 to node2
(node1)-[:RELTYPE*1..3]->(node2)
Retrieve the paths of lengths 1, 2, or 3 with the relationship :RELTYPE from node1 to node2 (up to 3 hops through intermediate nodes)
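To make the bounded form concrete, here is a sketch against the Movies example graph. It assumes the FOLLOWS relationships and the person 'Paul Blythe' used elsewhere in this deck; the path variable and the hops alias are illustrative names.

```cypher
// People reachable from Paul Blythe in 1 to 3 FOLLOWS hops,
// together with the number of hops for each path found.
MATCH path = (follower:Person)-[:FOLLOWS*1..3]->(p:Person)
WHERE follower.name = 'Paul Blythe'
RETURN p.name, length(path) AS hops
ORDER BY hops
```

Giving an upper bound keeps the traversal from exploring the whole graph, which matters on anything larger than a demo dataset.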
Finding the Shortest Path
MATCH p = shortestPath((m1:Movie)-[*]-(m2:Movie))
WHERE m1.title = 'A Few Good Men' AND
m2.title = 'The Matrix'
RETURN p
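A hedged variation, assuming the same Movies graph: the path can be given an upper bound, and the nodes along it can be listed in a table-friendly form. The bound of 6 hops is an arbitrary value chosen for illustration.

```cypher
// Shortest connection between two movies, at most 6 hops long.
// coalesce() picks title for Movie nodes and name for Person nodes.
MATCH p = shortestPath((m1:Movie)-[*..6]-(m2:Movie))
WHERE m1.title = 'A Few Good Men' AND m2.title = 'The Matrix'
RETURN length(p) AS hops,
       [n IN nodes(p) | coalesce(n.title, n.name)] AS via
```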
Returning a Subgraph
MATCH paths = (m:Movie)-[rel]-(p:Person)
WHERE m.title = 'The Replacements'
RETURN paths
OPTIONAL MATCH
Specifying Optional Pattern Matching
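Find all people whose name starts with James and, where they exist, the movies they reviewed. People without a REVIEWED relationship are still returned, with null for type(r) and m.title.
Figure: subgraph of the movies graph with all people named James and their relationships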
MATCH (p:Person)
WHERE p.name STARTS WITH 'James'
OPTIONAL MATCH (p)-[r:REVIEWED]->(m:Movie)
RETURN p.name, type(r), m.title
Exercise 5: Working with Patterns in Queries
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 5.
Note: This exercise has 6 steps. Estimated time to complete: 30 minutes
Check Your Understanding
Question 1
Given this Cypher query:
MATCH (follower:Person)-[:FOLLOWS]->(reviewer:Person)-[:REVIEWED]->(m:Movie)
WHERE m.title = 'The Replacements' RETURN follower.name, reviewer.name
What is the first node that is retrieved by the query engine?
Select the correct answer:
❏ The first Person node with a FOLLOWS relationship
❏ The first Person node with a REVIEWED relationship
❏ The Movie node for the movie, The Replacements
❏ The first Movie node in the alphabetical list of movies in the graph
Question 2
We want a query that returns a list of people who acted in movies released later than 2005 and for
those movies, also return title and released year of the movie, as well as the name of the writer. How
can you correct this query?
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
(m)<-[:WROTE]-(w:Person)
WHERE m.released > 2005
RETURN a.name, m.title, m.released, w.name
Select the correct answer:
❏ The second line should be: (m2:Movie)<-[:WROTE]-(w:Person).
❏ Add a comma after the first pattern in the MATCH clause.
❏ The second line should be: (m2:Movie)<-[:WROTE]-(a).
❏ Add a MATCH clause at the beginning of the second line.
Question 3
Suppose you have a graph of Person nodes representing a social network graph. A Person node can
have an IS_FRIENDS_WITH relationship with any other Person node. As in Facebook, there can be a
long path of connections between people. What Cypher MATCH clause would you use to find all
people in this graph that are two to four hops away from each other?
Select the correct answer:
❏ MATCH (p:Person)-[:IS_FRIENDS_WITH*2..4]->(p2:Person)
❏ MATCH (p:Person)-[:IS_FRIENDS_WITH*2-4]->(p2:Person)
❏ MATCH (p:Person)-[:IS_FRIENDS_WITH,2-4]->(p2:Person)
❏ MATCH (p:Person)-[:IS_FRIENDS_WITH,2,4]->(p2:Person)
Summary
You should now be able to write Cypher statements to ...
● Specify multiple MATCH clauses
● Specify multiple MATCH patterns
● Specify varying length paths
● Return a subgraph
● Specify OPTIONAL MATCH in a query
Working with Cypher Data
In This Module You’ll Learn ...
How to write Cypher statements to:
● Aggregate data into lists
● Work with lists
● Count results returned
● Work with maps
● Work with dates
Aggregating Data
Automatic Grouping in Cypher
MATCH (p:Person)-[:REVIEWED]->(m:Movie)
RETURN p.name, m.title
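Cypher has no GROUP BY clause. As soon as a RETURN or WITH clause contains an aggregating function such as count() or collect(), all non-aggregated expressions in that clause automatically become the grouping keys.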
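A short sketch of automatic grouping, assuming the Movies example graph: because count() and collect() are aggregating functions, the only non-aggregated expression, m.title, acts as the grouping key. The aliases numReviewers and reviewers are illustrative.

```cypher
// One row per movie title; reviewers are aggregated per title.
MATCH (p:Person)-[:REVIEWED]->(m:Movie)
RETURN m.title,
       count(p)        AS numReviewers,
       collect(p.name) AS reviewers
```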
Aggregation Using collect()
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name ='Tom Cruise'
RETURN collect(m.title) AS `movies for Tom Cruise`
Collecting Nodes
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name ='Tom Cruise'
RETURN collect(m) AS `movies for Tom Cruise`
Result viewed as a graph:
● The same as simply returning m
Result viewed as a table:
● Each node is an object in the list
Aggregation Using count()
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name, d.name, count(m)
Counting and Collecting
MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(director:Person)
RETURN actor.name, director.name,
count(m) AS collaborations, collect(m.title) AS movies
Using collect() and size()
Using size() is an alternative to using count()
● size() returns the number of elements in a list
● count() is an aggregating function that returns the number of non-null values in a group
This query returns the same result:
MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(director:Person)
RETURN actor.name, director.name, size(collect(m)) AS collaborations,
collect(m.title) AS movies
Working With Cypher Data
Movie nodes have 3 properties
● 2 of type String
● 1 of type Integer
Working with Lists
Lists
Return the cast list for every movie and the size of the cast
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title, collect(a) AS cast, size(collect(a)) AS castSize
Using Strings in Lists
Modifying the query slightly ...
● The list contains the names, instead of the entire set of Person node properties
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title, collect(a.name) AS cast, size(collect(a.name)) AS castSize
Accessing Elements of the List
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title, collect(a.name)[0] AS `A cast member`,
size(collect(a.name)) AS castSize
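List indexing and slicing can be tried on a literal list without touching the graph. In this sketch the list contents and aliases are made up for illustration; indexes start at 0, negative indexes count from the end, and the end of a slice is exclusive.

```cypher
WITH ['Keanu Reeves', 'Carrie-Anne Moss', 'Laurence Fishburne', 'Hugo Weaving'] AS cast
RETURN cast[0]    AS first,    // 'Keanu Reeves'
       cast[-1]   AS last,     // 'Hugo Weaving'
       cast[1..3] AS middle,   // ['Carrie-Anne Moss', 'Laurence Fishburne']
       size(cast) AS castSize  // 4
```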
Working with Maps
Working With Maps
RETURN {Jan: 31, Feb: 28, Mar: 31, Apr: 30 , May: 31, Jun: 30 ,
Jul: 31, Aug: 31, Sep: 30, Oct: 31, Nov: 30, Dec: 31}['Feb']
AS DaysInFeb
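Map values can be read with either bracket or dot notation. The following sketch uses a literal map whose keys mirror two Movie properties; the aliases are illustrative.

```cypher
WITH {title: 'The Matrix', released: 1999} AS movie
RETURN movie.title       AS viaDot,   // 'The Matrix'
       movie['released'] AS viaKey,   // 1999
       keys(movie)       AS keyNames  // list of the map's keys
```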
Accessing Map Elements
A map is returned when a returned node is displayed using Table in Neo4j Browser.
Figure: the returned Movie nodes displayed as maps
Map Projections
MATCH (m:Movie)
WHERE m.title CONTAINS 'Matrix'
RETURN m { .title, .released } AS movie
Working With Dates
RETURN date(), datetime(), time(), timestamp()
Accessing Components of Dates
RETURN date().day,
date().year,
datetime().year,
datetime().hour,
datetime().minute
date() and datetime() also accept an ISO 8601 string as an argument
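As an illustration of building a date from a string, the sketch below parses a fixed ISO 8601 date (chosen arbitrarily) so that the results do not depend on when the query runs. The aliases are illustrative.

```cypher
WITH date('2022-05-17') AS d
RETURN d.year  AS year,                          // 2022
       d.month AS month,                         // 5
       d.day   AS day,                           // 17
       d + duration({days: 14}) AS twoWeeksLater // 2022-05-31
```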
Working With timestamp()
RETURN
datetime({epochmillis:timestamp()}).day,
datetime({epochmillis:timestamp()}).year,
datetime({epochmillis:timestamp()}).month
timestamp() returns a long integer: the number of milliseconds since midnight, January 1, 1970 UTC
Type and Data Conversions
Here are some of the built-in conversion functions:
● toInteger()
● toLower()
● toUpper()
● toString()
Consult the Neo4j Cypher Manual for more information
● It includes much more on the built-in functions that are available
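The conversion functions listed above can be tried on literals. This is a minimal sketch; the aliases are illustrative, and the last column shows that toInteger() returns null when the string cannot be parsed.

```cypher
RETURN toInteger('42')  AS intValue,    // 42
       toString(42)     AS strValue,    // '42'
       toUpper('neo4j') AS upperValue,  // 'NEO4J'
       toLower('Neo4j') AS lowerValue,  // 'neo4j'
       toInteger('abc') AS notANumber   // null
```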
Exercise 6: Working with Cypher Data
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 6.
Note: This exercise has 6 steps. Estimated time to complete: 15 minutes
Check Your Understanding
Question 1
Which of the functions below aggregate results?
Select the correct answers:
❏ count()
❏ size()
❏ map()
❏ collect()
Question 2
What construct best represents a node in the graph?
Select the correct answer:
❏ list
❏ map
❏ collection
❏ blob
Question 3
Which date/time related function returns a long integer value?
Select the correct answer:
❏ date()
❏ datetime()
❏ time()
❏ timestamp()
Summary
You should now be able to:
● Aggregate data into lists
● Work with lists
● Count results returned
● Work with maps
● Work with dates
Controlling the Query Chain
In This Module You’ll Learn ...
How to write Cypher statements to:
● Perform intermediate processing with WITH
● Use WITH and UNWIND for query processing
● Perform subqueries with WITH
● Perform subqueries with CALL
Intermediate Processing Using WITH
For each actor, return:
● the number of movies they acted in
● the titles of the movies
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a.name, count(a) AS numMovies,
collect(m.title) AS movies
Using WITH
Existing variables must be specified in the WITH
to be available for reference later in the query
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a, count(a) AS numMovies, collect(m.title) AS movies
WHERE 1 < numMovies < 4
RETURN a.name, numMovies, movies
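The scoping rule can be seen in a slightly longer sketch, assuming the Movies example graph. After the WITH, only a and numMovies remain in scope; m is not carried forward, so the second MATCH introduces a new variable d for directed movies. The threshold of 5 and the alias directed are illustrative.

```cypher
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a, count(m) AS numMovies      // m is not passed on
WHERE numMovies >= 5
MATCH (a)-[:DIRECTED]->(d:Movie)   // a is still in scope
RETURN a.name, numMovies, collect(d.title) AS directed
```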
Using WITH and UNWIND
WITH and UNWIND are frequently used when importing data into a graph. UNWIND turns a list back into individual rows.
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WITH collect(p) AS actors, count(p) AS actorCount, m
UNWIND actors AS actor
RETURN m.title, actorCount, actor.name
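The effect of UNWIND is easiest to see on a literal list, with no graph data involved. This sketch returns three rows, one per list element; the alias square is illustrative.

```cypher
// Each element of the list becomes its own row.
UNWIND [1, 2, 3] AS n
RETURN n, n * n AS square
```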
Subqueries with WITH
MATCH (m:Movie)<-[rv:REVIEWED]-(r:Person)
WITH m, rv, r
MATCH (m)<-[:DIRECTED]-(d:Person)
RETURN m.title, rv.rating, r.name, collect(d.name)
Subqueries
Subquery
MATCH (p:Person)
WITH p, size((p)-[:ACTED_IN]->()) AS movies
WHERE movies >= 5
OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
RETURN p.name, m.title
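A hedged note on versions: to our knowledge, applying size() directly to a pattern is accepted in Neo4j 4.x but not in Neo4j 5. One way to express the same count that should work in both is a pattern comprehension, sketched here against the Movies example graph. The variable names inside the comprehension are illustrative.

```cypher
// Count ACTED_IN relationships per person via a pattern comprehension.
MATCH (p:Person)
WITH p, size([(p)-[:ACTED_IN]->(m:Movie) | m]) AS movies
WHERE movies >= 5
OPTIONAL MATCH (p)-[:DIRECTED]->(d:Movie)
RETURN p.name, d.title
```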
Performing Subqueries with CALL
CALL {
  MATCH (p:Person)-[:REVIEWED]->(m:Movie)
  RETURN m
}
MATCH (m) WHERE m.released=2000
RETURN m.title, m.released
The variable m returned by the subquery is used again in the rest of the query
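A CALL subquery can also use a variable from the enclosing query. The sketch below assumes Neo4j 4.1 or later, where a variable is imported with a leading WITH inside the braces, and the Movies example graph; the alias numMovies is illustrative.

```cypher
// For each person, run the subquery once and count their movies.
MATCH (p:Person)
CALL {
  WITH p                              // import p from the outer query
  MATCH (p)-[:ACTED_IN]->(m:Movie)
  RETURN count(m) AS numMovies
}
RETURN p.name, numMovies
ORDER BY numMovies DESC
```

Because the aggregation happens inside the subquery, people who acted in no movies still appear, with numMovies equal to 0.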
Exercise 7: Controlling Query Processing
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 7.
Note: This exercise has 5 steps. Estimated time to complete: 15 minutes
Check Your Understanding
Question 1
Given this code snippet, what variables can you use in the RETURN clause?
MATCH (a:Person)-[r:ACTED_IN]->(m:Movie)
WITH a, count(a) AS numMovies
WHERE 1 < numMovies < 4
RETURN ??
Select the correct answers:
❏ a
❏ r
❏ m
❏ numMovies
Question 2
What clauses enable you to perform subqueries?
Select the correct answers:
❏ SUBMATCH
❏ WITH
❏ QUERY
❏ CALL
Question 3
Given this Cypher query, what Cypher clause do you use here to turn the list of movies into rows?
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WITH collect(m) AS movies,count(m) AS movieCount, p
?? movies AS movie
RETURN p.name, movieCount, movie.title
Select the correct answer:
❏ ELEMENTS
❏ UNWIND
❏ ROWS
❏ SELECT
Summary
You should now be able to write Cypher statements to:
● Perform intermediate processing with WITH
● Use WITH and UNWIND for query processing
● Perform subqueries with WITH
● Perform subqueries with CALL
Controlling Results Returned
In This Module You’ll Learn ...
How to write Cypher statements to:
● Eliminate duplication in results
● Order results
● Limit the number of results
Eliminate Result Duplication
Example with Duplicate Results
MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks'
RETURN m.title, m.released
Returned 13 records. A movie that Tom Hanks both directed and acted in is matched twice, once per relationship.
Eliminating Duplication
Eliminate duplicates using DISTINCT
MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks'
RETURN DISTINCT m.title, m.released
Returned 12 records
Duplication in Lists
MATCH (p:Person)-[:ACTED_IN | DIRECTED | WROTE]->(m:Movie)
WHERE m.released = 2003
RETURN m.title, collect(p.name) AS credits
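A person with more than one of these relationships to the same movie appears more than once in the credits list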
Eliminating Duplication in Lists
MATCH (p:Person)-[:ACTED_IN | DIRECTED | WROTE]->(m:Movie)
WHERE m.released = 2003
RETURN m.title, collect(DISTINCT p.name) AS credits
WITH and DISTINCT to Eliminate Duplication
MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks'
WITH DISTINCT m
RETURN m.released, m.title
Order Results
Ordering Results
MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks' OR p.name = 'Keanu Reeves'
RETURN DISTINCT m.title, m.released ORDER BY m.released DESC
Ordering Multiple Results
You can order by as many properties as needed; list them after ORDER BY, separated by commas
MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks' OR p.name = 'Keanu Reeves'
RETURN DISTINCT m.title, m.released
ORDER BY m.released DESC, m.title
Rows are sorted by release year descending, then by title ascending
Limiting Results
Limiting the Number of Results
MATCH (m:Movie)
RETURN m.title AS title, m.released AS year
ORDER BY m.released DESC LIMIT 10
Returned 10 records
Limiting Number of Intermediate Results
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WITH m, p LIMIT 6
RETURN collect(p.name), m.title
Another Example Using LIMIT
MATCH (m:Movie)
WITH m LIMIT 5
MATCH path = (m)<-[:ACTED_IN]-(:Person)
WITH m, collect(path) AS paths
RETURN m, paths[0..2]
Note: This display in Neo4j Browser is with Connect result nodes unchecked
Alternative to LIMIT
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a, collect(m.title) AS movies
WHERE size(movies) = 5
RETURN a.name, movies
An alternative to the code above:
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a, count(*) AS numMovies, collect(m.title) AS movies
WHERE numMovies = 5
RETURN a.name, numMovies, movies
Exercise 8: Controlling Results Returned
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 8.
Note: This exercise has 5 steps. Estimated time to complete: 15 minutes
Check Your Understanding
Question 1
This code returns the titles of all movies that have been reviewed. Multiple people can review a
movie. How can you change this code so that a movie title will only be returned once?
MATCH (m:Movie)<-[:REVIEWED]-()
RETURN m.title
Select the correct answers:
❏ MATCH (m:Movie)<-[:REVIEWED]-() RETURN DISTINCT m.title
❏ MATCH (m:Movie)<-[:REVIEWED]-() RETURN UNIQUE m.title
❏ MATCH (m:Movie)<-[:REVIEWED]-() WITH DISTINCT m RETURN m.title
❏ MATCH (m:Movie)<-[:REVIEWED]-() WITH UNIQUE m RETURN m.title
Question 2
By how many property values can you order the returned result?
Select the correct answer:
❏ One
❏ As many as needed
❏ Two
❏ Three
Question 3
We want to retrieve the names of the five oldest persons in our dataset. What code will do this?
Select the correct answers:
❏ MATCH (p:Person)-[:ACTED_IN]->() WITH p LIMIT 5 RETURN DISTINCT p.name, p.born ORDER BY p.born
❏ MATCH (p:Person) WITH p LIMIT 5 RETURN DISTINCT p.name, p.born ORDER BY p.born
❏ MATCH (p:Person)-[:ACTED_IN]->() RETURN DISTINCT p.name, p.born ORDER BY p.born LIMIT 5
❏ MATCH (p:Person) RETURN DISTINCT p.name, p.born ORDER BY p.born LIMIT 5
Summary
You should now be able to write Cypher statements to:
● Eliminate duplication in results
● Order results
● Limit the number of results
Creating Nodes and Relationships
Overview
At the end of this module, you should be able to write Cypher statements to:
● Create a node:
■ Add and remove node labels.
■ Add and remove node properties.
■ Update properties.
● Create a relationship:
■ Add and remove properties for a relationship.
● Delete a node.
● Delete a relationship.
● Merge data in a graph:
■ Create nodes.
■ Create relationships.
Creating a node
Create a node of type Movie with the title property set to Batman Begins:
CREATE (:Movie {title: 'Batman Begins'})
Create a node of type Movie and Action with the title property set to Batman Begins:
CREATE (:Movie:Action {title: 'Batman Begins'})
Create a node of type Movie with the title property set to Batman Begins and return the node:
CREATE (m:Movie {title: 'Batman Begins'})
RETURN m
The <id> of the node is set by the graph engine
Creating multiple nodes
Create some Person nodes for actors and the producer of the movie, Batman Begins:
CREATE (:Person {name: 'Michael Caine', born: 1933}),
(:Person {name: 'Liam Neeson', born: 1952}),
(:Person {name: 'Katie Holmes', born: 1978}),
(:Person {name: 'Benjamin Melniker', born: 1913})
Important: CREATE will create a new node even if a node with the same properties
already exists. You can prevent this from happening in one of two ways:
1. You can use MERGE rather than CREATE when creating the node.
2. You can add constraints to your graph. Then an attempt to create a duplicate node will
result in an error.
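As a sketch of the first option, MERGE looks for a node matching the given label and properties and creates it only if none is found. The example assumes that name alone identifies a person, which is an assumption made here for illustration; ON CREATE SET applies only when the node is newly created.

```cypher
// Creates the node only if no Person named 'Michael Caine' exists yet.
MERGE (p:Person {name: 'Michael Caine'})
ON CREATE SET p.born = 1933
RETURN p
```

Running the statement a second time returns the existing node instead of adding a duplicate.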
Adding a label to a node
Add the Action label to the movie, Batman Begins, and return all labels for this node:
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m:Action
RETURN labels(m)
Removing a label from a node
Remove the Action label from the movie, Batman Begins, and return all labels for this node:
MATCH (m:Movie:Action)
WHERE m.title = 'Batman Begins'
REMOVE m:Action
RETURN labels(m)
Adding or updating properties for a node
Add the properties released, lengthInMinutes, videoFormat, and grossMillions to the movie Batman Begins:
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.released = 2005, m.lengthInMinutes = 140,
m.videoFormat = 'DVD', m.grossMillions = 206.5
RETURN m
● If a property does not exist for the node, it is added with the specified value.
● If a property exists for the node, it is updated with the specified value.
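A sketch of an alternative form, assuming the Batman Begins node created earlier: the += operator adds or updates several properties from a map in one step and leaves the node's other properties untouched.

```cypher
// Add or update two properties from a map; other properties are kept.
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m += {released: 2005, lengthInMinutes: 140}
RETURN m
```

By contrast, SET m = {...} with a plain = replaces all properties of the node with the map, so += is the safer choice for partial updates.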
Removing properties from a node
Properties can be removed in one of two ways:
• Set the property value to null
• Use the REMOVE keyword
Remove the grossMillions and videoFormat properties:
MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.grossMillions = null
REMOVE m.videoFormat
RETURN m
Exercise 9: Creating Nodes
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 9.
Note: This exercise has 18 steps. Estimated time to complete: 40 minutes
Creating a relationship
You create a relationship by:
1. Finding the “from node”.
2. Finding the “to node”.
3. Using CREATE to add the directed relationship between the nodes.
Create the :ACTED_IN relationship between the Person, Michael Caine and the Movie, Batman Begins:
MATCH (a:Person), (m:Movie)
WHERE a.name = 'Michael Caine' AND
m.title = 'Batman Begins'
CREATE (a)-[:ACTED_IN]->(m)
RETURN a, m
Creating multiple relationships
Create the :ACTED_IN relationship between the Person, Liam Neeson and the Movie, Batman Begins and the :PRODUCED relationship between the Person, Benjamin Melniker and the same movie:
MATCH (a:Person), (m:Movie), (p:Person)
WHERE a.name = 'Liam Neeson' AND
m.title = 'Batman Begins' AND
p.name = 'Benjamin Melniker'
CREATE (a)-[:ACTED_IN]->(m)<-[:PRODUCED]-(p)
RETURN a, m, p
Adding properties to relationships
This is the same technique you use for creating and updating node properties.
Add the roles property to the :ACTED_IN relationship from Christian Bale to Batman Begins:
MATCH (a:Person), (m:Movie)
WHERE a.name = 'Christian Bale' AND
m.title = 'Batman Begins' AND
NOT exists((a)-[:ACTED_IN]->(m))
CREATE (a)-[rel:ACTED_IN]->(m)
SET rel.roles = ['Bruce Wayne','Batman']
RETURN a, m
Removing properties from relationships
Use the same technique you use for removing node properties.
Remove the roles property from the :ACTED_IN relationship from Christian Bale to Batman Begins:
MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
WHERE a.name = 'Christian Bale' AND
m.title = 'Batman Begins'
REMOVE rel.roles
RETURN a, rel, m
Exercise 10: Creating Relationships
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 10.
Note: This exercise has 13 steps. Estimated time to complete: 35 minutes
Deleting a relationship
Delete the :ACTED_IN relationship between Christian Bale and Batman Begins:
MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
WHERE a.name = 'Christian Bale' AND
m.title = 'Batman Begins'
DELETE rel
RETURN a, m
[Figure: Batman Begins relationships]
After deleting the relationship from Christian Bale to Batman Begins
[Figures: Batman Begins relationships; Christian Bale relationships]
Deleting a relationship and a node - 1
Delete the :PRODUCED relationship between Benjamin Melniker and Batman Begins, as well as the Benjamin Melniker node:
MATCH (p:Person)-[rel:PRODUCED]->(:Movie)
WHERE p.name = 'Benjamin Melniker'
DELETE rel, p
[Figure: Batman Begins relationships]
Deleting a relationship and a node - 2
Attempt to delete Liam Neeson without deleting his relationships to other nodes. The attempt fails because a node that still has relationships cannot be deleted:
MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DELETE p
[Figure: Batman Begins relationships]
Deleting a relationship and a node - 3
Delete Liam Neeson and his relationships to any other nodes:
MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DETACH DELETE p
[Figure: Batman Begins relationships]
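Before choosing between DELETE and DETACH DELETE, it can help to see how many relationships a node still has. A minimal sketch, assuming the Liam Neeson node from the previous slides still exists:

```cypher
// Count the relationships attached to the node in either direction.
// OPTIONAL MATCH keeps the row even when the node has no relationships.
MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
OPTIONAL MATCH (p)-[rel]-()
RETURN p.name, count(rel) AS relationshipCount
```

A count of 0 means a plain DELETE would succeed. Otherwise the relationships must be deleted first, or DETACH DELETE used.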
Exercise 11: Deleting Nodes and Relationships
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 11.
Note: This exercise has 6 steps. Estimated time to complete: 20 minutes
Using MERGE to create nodes
Add a Michael Caine Actor node with a value of 1933 for born using MERGE. The Actor node is not found, so a new node is created:
MERGE (a:Actor {name: 'Michael Caine'})
SET a.born = 1933
RETURN a
[Figures: Current Michael Caine Person node; Resulting Michael Caine nodes]
Important: Only specify properties that will have unique keys when you merge.
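MERGE matches on the whole pattern, including every property listed in it. The sketch below illustrates why only identifying properties belong in the MERGE pattern; it assumes the Actor node created above:

```cypher
// Matches the existing node on its identifying property only,
// then updates the non-identifying property separately.
MERGE (a:Actor {name: 'Michael Caine'})
SET a.born = 1933
RETURN a

// By contrast, the following pattern would not match a node whose born
// value differs, so MERGE would create a second Michael Caine Actor node:
// MERGE (a:Actor {name: 'Michael Caine', born: 1934})
```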
Specifying creation behavior for the merge
Add a Sir Michael Caine Person node with a born value of 1934 using MERGE, and also set the birthPlace property:
MERGE (a:Person {name: 'Sir Michael Caine'})
ON CREATE SET a.born = 1934,
a.birthPlace = 'London'
RETURN a
[Figures: Current Michael Caine nodes; Resulting Michael Caine nodes]
Specifying match behavior for the merge
Add or update the Sir Michael Caine Person node:
MERGE (a:Person {name: 'Sir Michael Caine'})
ON CREATE SET a.born = 1934,
a.birthPlace = 'UK'
ON MATCH SET a.birthPlace = 'UK'
RETURN a
[Figure: Current Michael Caine nodes]
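ON CREATE and ON MATCH are often used to record when a node was first created and when it was last seen. A minimal sketch; the createdAt and lastSeenAt property names are illustrative assumptions, not part of the Movie graph:

```cypher
MERGE (a:Person {name: 'Sir Michael Caine'})
// Runs only when MERGE had to create the node.
ON CREATE SET a.createdAt = timestamp()
// Runs only when MERGE found an existing node.
ON MATCH SET a.lastSeenAt = timestamp()
RETURN a.name, a.createdAt, a.lastSeenAt
```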
Using MERGE to create relationships
Make sure that all Person nodes whose name ends with Caine are connected to the Movie node, Batman Begins:
MATCH (p:Person), (m:Movie)
WHERE m.title = 'Batman Begins' AND p.name ENDS WITH 'Caine'
MERGE (p)-[:ACTED_IN]->(m)
RETURN p, m
Exercise 12: Merging Data in Graph
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 12.
Note: This exercise has 16 steps. Estimated time to complete: 45 minutes
Check your understanding
Question 1
What Cypher clauses can you use to create a node?
Select the correct answers.
❏ CREATE
❏ CREATE NODE
❏ MERGE
❏ ADD
Answer 1
What Cypher clauses can you use to create a node?
Select the correct answers.
✅ CREATE
❏ CREATE NODE
✅ MERGE
❏ ADD
Question 2
Suppose that you have retrieved a node, s with a property, color:
What Cypher clause do you use to delete the color property from this node?
Select the correct answers.
❏ DELETE s.color
❏ SET s.color=null
❏ REMOVE s.color
❏ SET s.color=?
Answer 2
Suppose that you have retrieved a node, s with a property, color:
What Cypher clause do you use to delete the color property from this node?
Select the correct answers.
❏ DELETE s.color
✅ SET s.color=null
✅ REMOVE s.color
❏ SET s.color=?
Question 3
Suppose you retrieve a node, n in the graph that is related to other nodes. What
Cypher clause do you write to delete this node and its relationships in the graph?
Select the correct answers.
❏ DELETE n
❏ DELETE n WITH RELATIONSHIPS
❏ REMOVE n
❏ DETACH DELETE n
Answer 3
Suppose you retrieve a node, n in the graph that is related to other nodes. What
Cypher clause do you write to delete this node and its relationships in the graph?
Select the correct answers.
❏ DELETE n
❏ DELETE n WITH RELATIONSHIPS
❏ REMOVE n
✅ DETACH DELETE n
Summary
You should be able to write Cypher statements to:
● Create a node:
■ Add and remove node labels.
■ Add and remove node properties.
■ Update properties.
● Create a relationship:
■ Add and remove properties for a relationship.
● Delete a node.
● Delete a relationship.
● Merge data in a graph:
■ Creating nodes.
■ Creating relationships.
Indexes and Constraints
v 1.0
Managing constraints and node keys
Automatically control the data that is added to the graph:
• Uniqueness: Unique values for node properties
• Existence: Required properties for nodes or relationships
Ensuring that a property value for a node is unique
Ensure that the title for a node of type Movie is unique:
CREATE CONSTRAINT ON (m:Movie) ASSERT m.title IS UNIQUE
● This statement will fail if any Movie nodes in the graph have the same value for the title property.
● This statement will still succeed if some Movie nodes in the graph do not have the title property.
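Because the constraint cannot be created while duplicates exist, it can be useful to look for them first. A sketch of such a check:

```cypher
// List every title that is used by more than one Movie node.
MATCH (m:Movie)
WHERE m.title IS NOT NULL
WITH m.title AS title, count(*) AS copies
WHERE copies > 1
RETURN title, copies
ORDER BY copies DESC
```

An empty result suggests the uniqueness constraint can be created.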
Ensuring uniqueness using the constraint
After creating the constraint, we attempt to create a Movie with the title, The Matrix. The Movie graph already contains a movie with this title, so the statement fails:
CREATE (:Movie {title: 'The Matrix'})
Ensuring that properties exist
You can create a constraint that ensures that when a node or relationship is created or updated, a particular property must have a value:
CREATE CONSTRAINT ON (m:Movie) ASSERT exists(m.tagline)
This statement failed because the Movie node for the movie, Something’s Gotta Give, does not have a value for the tagline property.
Creating an exists constraint on a relationship
We know that in the Movie graph, all :REVIEWED relationships currently have a property, rating. We can create an existence constraint on that property as follows:
CREATE CONSTRAINT ON ()-[rel:REVIEWED]-() ASSERT exists(rel.rating)
Using the exists constraint on a relationship
After creating this constraint, an attempt to create a :REVIEWED relationship without setting the rating property fails:
MATCH (p:Person), (m:Movie)
WHERE p.name = 'Jessica Thompson' AND
m.title = 'The Matrix'
MERGE (p)-[:REVIEWED {summary: 'Great movie!'}]->(m)
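For comparison, a version of the same statement that satisfies the existence constraint might look like the sketch below. The rating value is made up for illustration:

```cypher
MATCH (p:Person), (m:Movie)
WHERE p.name = 'Jessica Thompson' AND
      m.title = 'The Matrix'
// rating is included so the existence constraint is satisfied;
// the value 95 is invented for this example.
MERGE (p)-[:REVIEWED {summary: 'Great movie!', rating: 95}]->(m)
RETURN p.name, m.title
```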
Retrieving constraints defined for the graph
CALL db.constraints()
Note: Adding the method notation for this CALL statement enables you to use the call for returning results that may be used later in the Cypher statement.
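With the method notation, the procedure's output can be passed on with YIELD. The sketch below assumes a Neo4j 4.x server, where db.constraints() is available and returns name and description columns:

```cypher
// Keep only the constraints whose description mentions Movie.
CALL db.constraints() YIELD name, description
WHERE description CONTAINS 'Movie'
RETURN name, description
```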
Dropping constraints
DROP CONSTRAINT ON ()-[rel:REVIEWED]-() ASSERT exists(rel.rating)
Creating node keys - 1
A node key is:
• A unique constraint for a set of properties for a node
• Implemented as an index in the graph
Suppose that in our Movie graph, we will not allow a Person node to be created where both the name and born properties are the same. We can create a constraint that will be a node key to ensure that this uniqueness for the set of properties is asserted:
CREATE CONSTRAINT ON (p:Person) ASSERT (p.name, p.born) IS NODE KEY
We attempt to create the constraint, but it fails because there is a Person node in the graph that does not have the born property set.
Creating node keys - 2
We then ensure that all Person nodes have a value for the born property:
MATCH (p:Person)
WHERE NOT exists(p.born)
SET p.born = 0
The creation of the node key will now be successful.
Any subsequent attempt to create a Person node, or modify an existing one, with name or born values that violate the uniqueness constraint as a node key will fail.
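Two statements that would be expected to violate the node key are sketched below. They assume the standard Movie graph, which already contains Tom Hanks with born set to 1956; the name Jane Example is invented. Run each statement separately:

```cypher
// Expected to fail: a Person with this name and born combination already exists.
CREATE (:Person {name: 'Tom Hanks', born: 1956});

// Expected to fail: a node key also requires every property in the key to exist,
// and born is missing here.
CREATE (:Person {name: 'Jane Example'})
```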
Using LOAD CSV for Import
In This Module You’ll Learn ...
How to:
● Prepare the graph and data for import
○ Inspect data
○ Determine if data needs to be transformed
○ Determine the size of the data that will be imported
○ Create the Constraints in the graph
● Import the data with LOAD CSV
● Create indexes for newly-loaded data
The APOC library adds support for importing data from various data formats, including JSON, XML, and XLS:
https://neo4j.com/labs/apoc/4.1/import/
Prepare for Data Import
CSV File Structure
Lines in CSV file contain rows of data from a data source
● Commonly this is from a table in an RDBMS
For the CSV file(s) determine:
● Whether the first row contains header information
○ This contains column names for all rows in the file
● What the delimiter between the fields in a row is
IDs Must Be Unique
Is the Data Clean?
1. Check for headers that do not match
2. Are quotes used correctly?
3. If an element has no value will an empty string be used?
4. Are UTF-8 prefixes used (for example uc)?
5. Do some fields have trailing spaces?
6. Do the fields contain binary zeros?
7. Understand how lists are formed
● The default is to use colon(:) as the separator
8. Is comma(,) the delimiter?
9. Check for typos
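Some of these checks can be run with LOAD CSV itself before any data is written. A sketch that looks for missing values and for leading or trailing spaces, assuming people.csv has a name column (the column name is a guess for illustration):

```cypher
LOAD CSV WITH HEADERS
FROM 'file:///people.csv'
AS line
WITH line
// Keep only rows where name is missing or has surrounding whitespace.
WHERE line.name IS NULL OR line.name <> trim(line.name)
RETURN line LIMIT 10
```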
Inspect the Data From a URL
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/v4.0-intro-neo4j/people.csv'
AS line
RETURN line LIMIT 10
Example: Inspect the Data Stored Locally
LOAD CSV WITH HEADERS
FROM 'file:///people.csv'
AS line
RETURN line LIMIT 10
Determine if Data Needs Transformation
● toInteger()
● toFloat()
LOAD CSV reads every field as a string, so fields that hold numbers must be transformed before they are stored as numeric properties.
Preview the Data Transformation
LOAD CSV WITH HEADERS
FROM 'file:///movies1.csv'
AS line
RETURN toFloat(line.avgVote), line.genres, toInteger(line.movieId),
line.title, toInteger(line.releaseYear) LIMIT 10
Transforming Lists
LOAD CSV WITH HEADERS
FROM 'file:///movies1.csv'
AS line
RETURN toFloat(line.avgVote), split(coalesce(line.genres,""), ":"),
toInteger(line.movieId), line.title, toInteger(line.releaseYear)
LIMIT 10
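The coalesce() call matters because split() returns null when its input is null. A small standalone sketch, with made-up values, shows the two cases side by side:

```cypher
// Without coalesce(), split(null, ":") would return null.
// With it, split() receives an empty string and returns a list.
RETURN split(coalesce(null, ""), ":") AS genresFromNull,
       split(coalesce("Action:Drama", ""), ":") AS genresFromValue
```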
Create Constraints Before Loading the Data
CREATE CONSTRAINT UniqueMovieIdConstraint ON (m:Movie) ASSERT m.id IS UNIQUE;
CREATE CONSTRAINT UniquePersonIdConstraint ON (p:Person) ASSERT p.id IS UNIQUE
Determine Size of the Data to be Loaded
LOAD CSV WITH HEADERS
FROM 'file:///people.csv'
AS line
RETURN count(line)
Loading a Large CSV File
Two options for loading data when the number of rows exceeds 100K:
1. USING PERIODIC COMMIT LOAD CSV
2. Use the APOC library
Helpful Links:
APOC: https://neo4j.com/labs/apoc/4.2/
apoc.periodic.iterate: https://neo4j.com/labs/apoc/4.2/graph-updates/periodic-execution/
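The second option might look like the sketch below. It assumes the APOC library is installed and that people.csv has personId and name columns (both column names are assumptions for illustration). The first string produces the rows, and the second is run for each row in batches of 500:

```cypher
CALL apoc.periodic.iterate(
  // Outer statement: stream the rows of the file.
  "LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row RETURN row",
  // Inner statement: executed for each row, committed once per batch.
  "MERGE (p:Person {id: toInteger(row.personId)}) SET p.name = row.name",
  {batchSize: 500, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
```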
CSV Data Import
Importing Nodes
:auto USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM
'https://data.neo4j.com/v4.0-intro-neo4j/movies1.csv' AS row
MERGE (m:Movie {id:toInteger(row.movieId)})
ON CREATE SET
m.title = row.title,
m.avgVote = toFloat(row.avgVote),
m.releaseYear = toInteger(row.releaseYear),
m.genres = split(row.genres,":")
More on USING PERIODIC COMMIT -
https://neo4j.com/developer/guide-import-csv/#_important_tips_for_load_csv
Importing Relationships
LOAD CSV WITH HEADERS FROM
'https://data.neo4j.com/v4.0-intro-neo4j/directors.csv' AS row
MATCH (movie:Movie {id:toInteger(row.movieId)})
MATCH (person:Person {id: toInteger(row.personId)})
MERGE (person)-[:DIRECTED]->(movie)
ON CREATE SET person:Director
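After the relationships are loaded, a quick query can confirm that the :DIRECTED relationships and the Director label were created. A sketch, using only the id property set during the import:

```cypher
// Show the ten directors with the most imported movies.
MATCH (p:Director)-[:DIRECTED]->(m:Movie)
RETURN p.id AS personId, count(m) AS moviesDirected
ORDER BY moviesDirected DESC
LIMIT 10
```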
Create Indexes
Add Indexes
// Do this only after ALL data has been imported
CREATE INDEX MovieTitleIndex FOR (m:Movie) ON (m.title);
CREATE INDEX PersonNameIndex FOR (p:Person) ON (p.name)
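Indexes are populated in the background, so it can be worth checking that they are online before relying on them. The sketch below assumes a Neo4j 4.x server, where the db.indexes() procedure is available:

```cypher
// state shows whether each index is populating or online;
// populationPercent shows how far population has progressed.
CALL db.indexes() YIELD name, state, populationPercent
RETURN name, state, populationPercent
```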
Exercise 16: LOAD CSV for Import
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 16.
Note: This exercise has 9 steps. Estimated time to complete: 30 minutes
Check Your Understanding
Question 1
When you execute LOAD CSV what unit of data is read from the data source?
Select the correct answer:
❏ A field
❏ All field values for a single field
❏ A row
❏ A table
Answer 1
When you execute LOAD CSV what unit of data is read from the data source?
Select the correct answer:
❏ A field
❏ All field values for a single field
✅ A row
❏ A table
Question 2
What should you add to the graph before you import using LOAD CSV?
Select the correct answer:
❏ Indexes for all important queries
❏ Schema containing the names of the node labels that will be created
❏ Schema containing the types that will be assigned to properties during the load
❏ Uniqueness constraints
Answer 2
What should you add to the graph before you import using LOAD CSV?
Select the correct answer:
❏ Indexes for all important queries
❏ Schema containing the names of the node labels that will be created
❏ Schema containing the types that will be assigned to properties during the load
✅ Uniqueness constraints
Question 3
In general, what is the maximum number of rows you can process using LOAD CSV?
Select the correct answer:
❏ 1K
❏ 10K
❏ 100K
❏ 1M
Answer 3
In general, what is the maximum number of rows you can process using LOAD CSV?
Select the correct answer:
❏ 1K
❏ 10K
✅ 100K
❏ 1M
Summary
You should now be able to:
● Describe the steps for importing data with Cypher
● Prepare the graph and data for import
● Import the data with LOAD CSV
● Create indexes for newly-loaded data
Using Query Best Practices
In This Module You’ll Learn ...
How to:
● Use parameters in your Cypher statements
● Analyze Cypher execution
● Monitor queries
Cypher Parameters
● Most deployed applications that use Neo4j have client code written in other languages
○ For example: Java, JavaScript, Python, and others
● In deployed applications, values are almost never hard-coded in Cypher statements
● Cypher parameters are used to pass values to Cypher statements
Using Cypher Parameters
In Cypher, parameter names begin with $
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
RETURN m.released, m.title ORDER BY m.released DESC
At runtime, the value of $actorName is used in the Cypher statement
Setting a Parameter
:param actorName => 'Tom Hanks'
Using the Parameter
:param actorName => 'Tom Cruise'
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
RETURN m.released, m.title
ORDER BY m.released DESC
Setting Multiple Parameters
:params {actorName: 'Tom Cruise', movieName: 'Top Gun'}
Using Multiple Parameters
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName AND m.title = $movieName
RETURN p, m
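Parameters can hold lists as well as single values. The sketch below assumes a hypothetical actorNames parameter that has been set beforehand in Neo4j Browser:

```cypher
// Set beforehand in Neo4j Browser with:
//   :param actorNames => ['Tom Hanks', 'Tom Cruise']
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name IN $actorNames
RETURN p.name, m.title
ORDER BY p.name, m.title
```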
Viewing Parameters
:params
Clearing All Parameters
:params {}
Analyzing Queries
There are two ways to analyze Cypher queries: prefix the query with either EXPLAIN or PROFILE.
EXPLAIN
● Provides estimates of the graph engine processing
● It does not execute the Cypher statement
PROFILE
● The graph engine executes the query
● Provides profiling information based on what occurred during execution
Analysis Using EXPLAIN
EXPLAIN returns a Cypher query plan
A Cypher query plan shows what is expected:
● Operations
● Where rows are processed
● What rows are passed on to the next operation (step)
Evaluating and comparing Cypher statements
● Use it to understand the stages of processing that will occur when the Cypher statement executes
Setting Parameters
:params {actorName: 'Hugo Weaving', year: 2000}
Using EXPLAIN
EXPLAIN MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
AND m.released < $year
RETURN p.name, m.title, m.released
Expanding the Steps
EXPLAIN MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
AND m.released < $year
RETURN p.name, m.title, m.released
Showing all steps:
Using PROFILE
PROFILE MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
AND m.released < $year
RETURN p.name, m.title, m.released
Showing all steps expanded
Expanding PROFILE Steps
PROFILE MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
AND m.released < $year
RETURN p.name, m.title, m.released
PROFILE Without Node Labels
PROFILE MATCH (p)-[:ACTED_IN]->(m)
WHERE p.name = $actorName
AND m.released < $year
RETURN p.name, m.title, m.released
Query changed
● With Labels: (p:Person)-[:ACTED_IN]->(m:Movie)
● No Labels: (p)-[:ACTED_IN]->(m)
[Figures: No Labels; With Labels]
Monitoring Queries
Causes for long running Cypher queries:
● The query returns a large amount of data
○ Although the query completed execution in the graph engine,
it is still creating the result stream
● Query execution takes a long time to complete processing
Example A:
MATCH (a)--(b)--(c)--(d)--(e)--(f)--(g)
RETURN a
Example B:
MATCH (a), (b), (c), (d), (e)
RETURN count(id(a))
Killing a Query
Kill the query by closing the result pane
Monitoring Queries
● :queries command
[Figures: Browser with long running query; Browser opened to monitor the query]
Monitoring Long-running Query
Killing a Long-running Query
The :queries command is only available in Neo4j Enterprise Edition
Some Best Practices
Cypher Query Best Practices
● Indexes: Create and use indexes effectively
● Parameters: Use parameters rather than literals in queries
● Labels: Specify node labels in MATCH clauses
● Rows:
○ Reduce the number of rows passed and processed
○ Reduce the rows processed by using DISTINCT and LIMIT early in query
● Aggregate: Early in the query, rather than in the RETURN clause
● Properties: Defer property access until it is needed
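Several of these practices can be combined in one statement. A sketch, assuming the actorName parameter from earlier in this module has been set:

```cypher
// Labels and a parameter narrow the starting point.
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = $actorName
// Limit the rows early, before any further processing.
WITH m
ORDER BY m.released DESC
LIMIT 5
// Read the title only for the rows that survive the LIMIT.
RETURN m.title, m.released
```

Prefixing the statement with PROFILE, as shown earlier, is one way to check whether a change of this kind actually reduces the rows and db hits.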
Exercise 15: Using Query Best Practices
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 15.
Note: This exercise has 14 steps. Estimated time to complete: 30 minutes
Check Your Understanding
Question 1
What Cypher keyword can you use to prefix any Cypher statement to examine how
many db hits occurred when the statement executed?
Select the correct answer:
❏ ANALYZE
❏ EXPLAIN
❏ PROFILE
❏ MONITOR
Answer 1
What Cypher keyword can you use to prefix any Cypher statement to examine how
many db hits occurred when the statement executed?
Select the correct answer:
❏ ANALYZE
❏ EXPLAIN
✅ PROFILE
❏ MONITOR
Question 2
What commands do you use to set values for parameters in your Neo4j Browser
session?
Select the correct answers:
❏ :set param
❏ :param
❏ :set params
❏ :params
Answer 2
What commands do you use to set values for parameters in your Neo4j Browser
session?
Select the correct answers:
❏ :set param
✅ :param
❏ :set params
✅ :params
Question 3
Suppose you are executing queries in Neo4j Browser Session A and monitoring them
in Neo4j Browser Session B with the :queries command. What are some ways that
you can kill a query?
Select the correct answers:
❏ You can close the result pane in Session A, if the query can be seen in Session B
❏ You can close the result pane in Session A, if the query can no longer be seen in Session B
❏ You can kill any running query seen in Session B
❏ You can close the Neo4j Browser that is running Session A
Summary
You should now be able to:
● Use parameters in your Cypher statements
● Analyze Cypher execution
● Monitor queries
Thank You!
Accessing Neo4j resources
There are many ways that you can learn more about Neo4j. A
good starting point for learning about the resources available to
you is the Neo4j Learning Resources page at
https://neo4j.com/developer/resources/.

5.17 - IntroductionToNeo4j-allSlides_1_2022_DanMc.pdf

  • 1.
    Welcome Everyone! Introduction to Neo4j 2022.5.17 Your Instructors: DanMcNamara & Syd Beckett To Do if Not Done Already: Install Neo4j Desktop from neo4j.com/download Install Neo4j Aura from https://neo4j.com/cloud/aura/ - If local desktop install is problematic - Create Sandbox on neo4j.com/sandbox
  • 2.
    Today’s Instructors Dan McNamara SydBeckett Solution Engineers at Neo4j Inc. dan.mcnamara@neo4j.com syd.beckett@neo4j.com 2
  • 3.
    3 Our Plan forToday Agenda • Neo4j Platform Overview • Installation/Setup • Intro to Cypher ✔ w/ Exercises Objectives & Outcomes • Install and run Neo4j locally • Learn Cypher ✔ Creating/Updating Graphs ✔ Pattern Matching ✔ Aggregations ✔ Creating nodes/relationships ✔ Loading data from files • Know where to go next Breaks/Lunch • 2 Breaks (15 minutes) ✔ 10:30ish ✔ 2:30ish • Lunch ✔ 1 hour 12:00-1:00ish
  • 4.
    4 Prep Items To Doif Not Done Already: Install Neo4j Desktop from neo4j.com/download Install Neo4j Aura from https://neo4j.com/cloud/aura/ - If local desktop or Aura install is problematic - • Create Sandbox on neo4j.com/sandbox Helpful Links: Neo4j Developer Materials: https://neo4j.com/developer/ Cypher RefCard: https://neo4j.com/docs/cypher-refcard/current/
  • 5.
  • 6.
    6 What’s the pointof graphs? A graph lets us model the real world to answer tough questions about how things are connected, especially in ways that may not be obvious! Seven Bridges of Konigsberg problem. Leonhard Euler, 1735
  • 7.
    7 What is GraphTheory? In mathematics, Graph Theory is the study of graphs, which are mathematical structures used to model relationships between concepts. More intuitively: Graph Theory is the study of relationships.
  • 8.
    8 What is aGraph? G = (V, E) V: a set of vertices E: a set of edges, where an edge is a pair of vertices V = {1,2,3,4,5,6} E = { {1,2},{2,3},{2,6},{2,6},{3,5},{3,4},{5,4},{6,5} } 1 2 3 4 5 6
  • 9.
    Traversal is theprocess of following a sequence of edges that link adjacent vertices 9 What is Traversal? 1 2 3 4 5 6 w = ( {1,2},{2,3} )
  • 10.
    Graphs Are Everywhere! 10 TheInternet H O H Chemistry ActiveDirectory & LDAP Public Transit & Supply Chains Social Networks
  • 11.
  • 12.
    12 Harnessing Connections DrivesBusiness Value Enhanced Decision Making Hyper Personalization Massive Data Integration Data Driven Discovery & Innovation Product Recommendations Personalized Health Care Media and Advertising Fraud Prevention Network Analysis Law Enforcement Drug Discovery Intelligence and Crime Detection Product & Process Innovation 360 view of customer Compliance Optimize Operations Data Science AI & ML Fraud Prediction Patient Journey Customer Disambiguation Transforming Industries
  • 13.
    Modern Graph TheoryApplications 13 Real-Time Recommendations Fraud Detection Network & IT Operations Master Data Management Knowledge Graph Identity & Access Management https://neo4j.com/use-cases/ https://neo4j.com/sandbox/ https://neo4j.com/graphgists/
  • 14.
    Data connections hasbecome the foundation of business technology & created industry leaders
  • 15.
    15 Relationships in RDBMS ●Require foreign keys, and possibly a lookup table ● Traversing a foreign key requires an index lookup The purpose of graphs is to do rapid traversal. The RDBMS model is too expensive for that. Person ID Name 1 Anne 2 James 3 Alex Address ID Country 1 Germany 2 USA Lookup Person Address 1 2 2 2 3 1
  • 16.
    Joins are executedevery time you query the relationship Executing a Join means to search for a key B-Tree Index: O(log(n)) Your data grows by 10x, your time goes up by one step on each Join More Data = More Searches Slower Performance The Problem 1 2 3 4
  • 17.
    Relational Databases can’thandle Relationships Degraded Performance Speed plummets as data grows and as the number of joins grows Wrong Language SQL was built with Set Theory in mind, not Graph Theory Not Flexible New types of data and relationships require schema redesign Wrong Model They cannot model or store relationships without complexity 1 2 3 4
  • 18.
    Relationships in RDBMSvs Graph MATCH (sub)-[:REPORTS_TO*0..3]->(boss), (report)-[:REPORTS_TO*1..3]->(sub) WHERE boss.name = 'John Doe' RETURN sub.name AS Subordinate, count(report) AS Total Find all direct reports and how many people they manage, up to 3 levels down Graph DB Query (using Cypher Query Language) SQL Query 18 Project Impact Less time writing queries • More time understanding the answers • Leaving time to ask the next question Less time debugging queries: • More time writing the next piece of code • Improved quality of overall code base Code that’s easier to read: • Faster ramp-up for new project members • Improved maintainability & troubleshooting
  • 19.
    NoSQL Databases can’thandle Relationships Degraded Performance Speed plummets as you try to join data together in the application Wrong Languages Lots of odd “almost sql” languages terrible at “joins” Not ACID No support for transactions Wrong Model They cannot model or store relationships without complexity 1 2 3 4
  • 20.
  • 21.
    21 Graph Databases: Designedfor Connected Data RELATIONAL DATABASES Store and retrieve data NoSQL DATABASES Aggregate and filter data Connections in data Real time storage & retrieval Real-Time Connected Insights Long running queries aggregation & filtering “Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code” Volker Pacher, Senior Developer From Disparate Silos To Cross-Silo Connections
  • 22.
  • 23.
    23 In This ModuleYou’ll Learn ... At the end of this module, you should be able to: ● Describe the components and benefits of the Neo4j.
  • 24.
    Connections in Dataare as Valuable as the Data Itself Networks of People Transaction Networks Bought B ou gh t V i e w e d R e t u r n e d Bought Knowledge Networks Pl ay s Lives_in In_sport Likes F a n _ o f Plays_for E.g., Risk management, Supply chain, Payments E.g., Employees, Customers, Suppliers, Partners, Influencers E.g., Enterprise content, Domain specific content, eCommerce content K n o w s Knows Knows K n o w s
  • 25.
    Neo4j - TheGraph Company 750+ 7 of 10 20 of 25 7 of 10 53K+ 100+ 300+ 450+ Adoption Top Retail Firms Top Financial Firms Top Software Vendors Customers Partners •Founders wrote the book on Graph •Now wrote the book on Graph Algorithms •Creator of the Neo4j Graph Platform •~350 employees •HQ in Silicon Valley, other offices include Boston, London, Munich, Paris and Malmö •Market: Neo4j is the clear leader. More customers and usage than all other Graph products combined (DB-Engines) Ecosystem SMB building products based on Neo4j Enterprise customers Partners Meetup members Events per year Industry’s Largest Dedicated Investment in Graphs 8 of 10 Top Insurance Providers
  • 26.
    26 Harnessing Connections DrivesBusiness Value Enhanced Decision Making Hyper Personalization Massive Data Integration Data Driven Discovery & Innovation Product Recommendations Personalized Health Care Media and Advertising Fraud Prevention Network Analysis Law Enforcement Drug Discovery Intelligence and Crime Detection Product & Process Innovation 360 view of customer, vendor, product, etc. Compliance Optimize Operations Connected Data at the Center AI & Machine Learning Price optimization Product Recommendations Resource allocation Digital Transformation Megatrends
  • 27.
    Neo4j – Re-ImagineYour Data as a Graph Neo4j is an enterprise-grade graph database that enables you to: •Model and store your data as a graph •Query data relationships with ease and in real-time •Seamlessly evolve applications to support new requirements by adding new kinds of data and relationships ● Agile development ● High performance ● Vertical and horizontal scale ● Seamless evolution
  • 28.
    28 Store and applygranular access control to the most sensitive data Designed for Enterprise-Grade Workloads Find insights and connections across Billions of nodes Scalability Security Flexibility Expand your graph database to multiple use cases
  • 29.
    Native Storage andProcessing Index Free Adjacency Neo4j disk and memory structures link data directly, allowing millions graph traversals per second per core. Graph data and paths between data do not have to be pre-defined before they can be used. 29
  • 30.
    Transactional consistency -all updates either succeed or fail. 30 Neo4j Database ACID Transactions ● Atomicity ● Consistency ● Isolation ● Durability ACID Consistency Non-ACID Graph DBMs (NoSQL)
  • 31.
    Property Graph -Simply Powerful Employee City Company Nodes represent objects (nouns) Relationships are directional Relationships connect nodes are represent actions (verbs) Relationships can have properties (name/value pairs) Nodes can have properties (name/value pairs) name: Amy Peters date_of_birth: 1984-03-01 employee_ID: 1 :HAS_CEO start_date: 2008-01-20 :LOCATED_IN
  • 32.
    Modeling relational tograph 32 In some ways they’re similar: Relational Graph Rows Nodes Joins Relationships Table names Labels Columns Properties In some ways they’re not: Relational Graph Each column must have a field value. Nodes with the same label aren't required to have the same set of properties. Joins are calculated at query time. Relationships are stored on disk when they are created. A row can belong to one table. A node can have many labels.
  • 33.
    How we model:RDBMS vs graph 33 Relational Graph Try and get the schema defined and then make minimal changes to it after that. It's common for the schema to evolve with the application. More abstract focus when modeling. i.e. Focus on classes rather than objects. Common to use actual data items when modeling.
  • 34.
    RDBMS vs graphmodels 34 players id name position clubs id name country transfers id fee player_age player_id from_club_id to_club_id season
  • 35.
    RDBMS Vocabulary Mappedto Graph Modeling Relational DB Construct Graph DB Construct Entity table Node labels Row Node Columns Node properties Technical primary keys Replace with business primary keys Constraints Unique constraints for business keys Indexes Indexes on any property Foreign keys Relationships Default values Not required De-normalized or duplicated data Create separate nodes Join tables Relationships Join table columns Relationship properties
  • 36.
  • 37.
    Neo4j’s Property Graph • Nodes • Relationships • Labels • Properties
  • 38.
    Neo4j’s Property Graph • Node = Vertex • Relationship = Edge
  • 39.
    Neo4j’s Property Graph Nodes • Represent objects or entities • Can be labeled (example labels: Car, Person)
  • 40.
    Neo4j’s Property Graph Nodes • Represent objects or entities • Can be labeled • May have properties Example: Person {name: “Dan”, born: May 29, 1970, twitter: “@dan”}, Person {name: “Ann”, born: Dec 5, 1975}, Car {brand: “Volvo”, model: “V70”, year: 2010}
  • 41.
    Neo4j’s Property Graph Relationships • Must have a type • Must have a direction Example: DRIVES, LOVES, and OWNS relationships among the two Person nodes and the Car node
  • 42.
    Neo4j’s Property Graph Relationships • Must have a type • Must have a direction • May have properties (example: since: 2018-10-1)
  • 43.
    Neo4j’s Property Graph Relationships • Must have a type • Must have a direction • May have properties • Nodes can share multiple relationships (example: LOVES, LIVES WITH)
  • 44.
    Neo4j’s Property Graph Relationships • Must have a type • Must have a direction • May have properties • Nodes can share multiple relationships (example graph: three Person nodes and one Car node connected by LOVES, LIVES WITH, DRIVES, and OWNS)
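    The example graph on these slides could be created with a statement along these lines. This is a minimal sketch: the property values come from the slides, but which relationship carries the since property, and the exact direction of LIVES_WITH, are assumptions.

    ```cypher
    CREATE (dan:Person {name: 'Dan', born: date('1970-05-29'), twitter: '@dan'})
    CREATE (ann:Person {name: 'Ann', born: date('1975-12-05')})
    CREATE (car:Car {brand: 'Volvo', model: 'V70', year: 2010})
    // Two nodes can share several relationships, each with a type and a direction.
    CREATE (dan)-[:LOVES]->(ann)
    CREATE (ann)-[:LOVES]->(dan)
    CREATE (dan)-[:LIVES_WITH]->(ann)
    // Assumption: the slide does not say which relationship holds `since`.
    CREATE (dan)-[:DRIVES {since: date('2018-10-01')}]->(car)
    CREATE (ann)-[:OWNS]->(car)
    RETURN dan, ann, car
    ```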
  • 45.
  • 46.
    Neo4j Graph Platform The Neo4j Graph Platform includes components that enable you to develop your graph-enabled application. To better understand the Neo4j Graph Platform, you will learn about these components and the benefits they provide. The heart of the Neo4j Graph Platform is the Neo4j Database.
  • 47.
    Neo4j DBMS: Clusters Neo4j cluster support • ACID across all locations • Available in Neo4j Enterprise Edition Clusters provide: • High availability • Scalability for read access to data • Failover, a vital requirement for many enterprises
  • 48.
    Develop Applications Faster and Easier Neo4j Advantage – Developer productivity Official Language Drivers (JavaScript, Java, .NET, Python, Go) • Foundational drivers for popular programming languages • Bolt: streaming binary wire protocol • Authoritative mapping to native type system, uniform across drivers • Pluggable into richer frameworks Community drivers are also available
  • 49.
    Libraries Out-of-the-box: • Awesome Procedures on Cypher (APOC) • Graph Data Science • GraphQL The Neo4j community has also contributed many specialized libraries.
  • 50.
    Tools • Neo4j Desktop* • Neo4j Browser* • Neo4j Cypher-shell • Neo4j Bloom • Neo4j ETL Tool • Neo4j Graph Algorithms • Neo4j BI-Connector The Neo4j community has also contributed many specialized tools.
  • 51.
    Neo4j Desktop: UI for developers & DB management Supports “plugins” • Neo4j official plugins • Neo4j labs plugins • 3rd party plugins • Bloom plugin for use with local databases managed by Desktop only Allows you to manage local databases • Create, stop, start, manage • Add APOC procs, etc. • See log files, configuration, etc. Allows you to connect to remote databases • You can’t manage them, but you can open Browser Supports organization via “projects”
  • 52.
    Neo4j Browser In reality • Lightweight web/JavaScript application Purpose • Cypher coding • Quick/small visualizations • Exporting result sets Limitations – only one at a time Available via your favorite web browser • Same Bolt protocol & UI • Easy way to bypass the above limitation https://www.youtube.com/watch?v=oHo-lQ79zf0&feature=youtu.be
  • 53.
    Neo4j Bloom User Interface • Search with type-ahead suggestions • Category icons and color scheme • Visualize, Explore and Discover • Pan, Zoom and Select • Property Browser and editor
  • 54.
  • 55.
    Graph Algorithms in Neo4j (neo4j.com/graph-algorithms-book/) • Pathfinding & Search: finds optimal paths or evaluates route availability and quality • Centrality / Importance: determines the importance of distinct nodes in the network • Community Detection: detects group clustering or partition options • Similarity: evaluates how alike nodes are • Link Prediction: estimates the likelihood of nodes forming a future relationship
  • 56.
    Visually Recognizing Patterns Believe it or not… …the starting node was not the one in the center …the “bridging” entity resolution nodes between clusters were unexpected
  • 57.
    Sometimes it is simple to see…. (1..3 hops) MATCH p=(ah1:BusinessCustomer:AccountHolder)-[:MAKES_PAYMENTS_TO*1..3]->(ah2:BusinessCustomer:AccountHolder) WHERE ah1.accountName="Lang and Sons" AND ah2.accountName="Klein, Johnston and Glover" RETURN p LIMIT 500
  • 58.
    …other times it is chaos…. (1..4 hops) MATCH p=(ah1:BusinessCustomer:AccountHolder)-[:MAKES_PAYMENTS_TO*1..4]->(ah2:BusinessCustomer:AccountHolder) WHERE ah1.accountName="Lang and Sons" AND ah2.accountName="Klein, Johnston and Glover" RETURN p LIMIT 500
  • 59.
    It’s a matter of scale…..
  • 60.
    The point? Visualization tools can help…. • …but with any volume, attempting to recognize patterns visually is quickly overwhelming This is where graph algorithms come in • Entity resolution 🡪 disambiguation 🡪 similarities, link prediction • Fraud networks 🡪 community detection, centrality • Payment chaining 🡪 community detection, centrality, pathfinding/search Use the results to re-visualize • Set node size/colors based on graph algo weights/scores Algorithm categories • Community Detection: detects group clustering or partition options • Centrality / Importance: determines the importance of distinct nodes • Similarity: measures node similarity based on neighbors and relationships • Pathfinding & Search: finds optimal paths or route availability and quality • Link Prediction: estimates the likelihood of nodes forming a future relationship
  • 61.
    Kettle - An ETL Platform that Speaks Neo4j ● Download at: https://kettle.be ○ Make sure to install Java 8 ● Cross-platform drag-and-drop ETL Workbench GUI ○ No coding required! ● Includes server components for scheduling and running complex jobs ○ Scales from local desktop use to production server cluster
  • 62.
    Neo4j BI Connector The most popular BI tools can now talk live to the world’s most popular graph db • Best live, seamless integration of graph data with your favorite BI tools • Familiar UI for end users • No development effort for IT • Democratizing access to Neo4j data • Free to adopt by BI teams of Enterprise Edition customers Diagram: Business/Data Analyst, Investigator, Data Scientist 🡪 BI tool (Tableau) 🡪 SQL over JDBC 🡪 Neo4j BI Connector 🡪 Cypher
  • 63.
  • 64.
    What Else is Online for help with Neo4j
  • 65.
  • 66.
    Question 1 What are some of the benefits provided by the Neo4j Graph Platform? Select the correct answers. ❏ Database clustering ❏ ACID ❏ Index free adjacency ❏ Optimized graph engine
  • 67.
    Answer 1 What are some of the benefits provided by the Neo4j Graph Platform? Select the correct answers. ✅ Database clustering ✅ ACID ✅ Index free adjacency ✅ Optimized graph engine
  • 68.
    Question 2 What libraries are included with Neo4j Graph Platform? Select the correct answers. ❏ APOC ❏ JGraph ❏ Graph Data Science ❏ GraphQL
  • 69.
    Answer 2 What libraries are included with Neo4j Graph Platform? Select the correct answers. ✅ APOC ❏ JGraph ✅ Graph Data Science ✅ GraphQL
  • 70.
    Question 3 What are some of the language drivers that come with Neo4j out of the box? Select the correct answers. ❏ Java ❏ Ruby ❏ Python ❏ JavaScript
  • 71.
    Answer 3 What are some of the language drivers that come with Neo4j out of the box? Select the correct answers. ✅ Java ❏ Ruby ✅ Python ✅ JavaScript
  • 72.
    Summary You should be able to: ● Describe the components and benefits of the Neo4j Graph Platform.
  • 73.
    Getting around in Neo4j Desktop & Browser Note: Much of this was covered in the videos in the instructions sent before class, so we are going to cover it quite quickly
  • 74.
    Overview At the end of this module, you should be able to: ● Start using Neo4j Desktop / Neo4j Sandbox ● Start using Neo4j Browser
  • 75.
    Neo4j Desktop • Full-featured Neo4j Enterprise Edition • Single user license • Runs on your laptop or desktop computer • 4-core max • Includes Browser • Includes Free Bloom Visualization License
  • 76.
    Neo4j Sandbox • Web browser access to Neo4j Database Server and Neo4j Database in the cloud • Comes with a blank or pre-populated database • Temporary access - Instance lives for up to ten days • No need to install Neo4j on your machine
  • 77.
    Neo4j Aura (https://neo4j.com/cloud/aura/) • Database as a Service • Various configurations to choose from • Scale up or down • Pay only for the amount of time you use it • Runs in the cloud. No need to install Neo4j on your machine • Includes Bloom Visualization tool
  • 78.
  • 79.
    Neo4j Desktop (1.4.1) • Project Folders • Create Graph (database) • Start Browser (same as http://localhost:7474) • Add Plugin: Graph Algorithms, APOC (procedures), GraphQL
  • 80.
    License keys • Select “Add software key” • Copy/paste link • Only manages license keys for local database instances managed by Neo4j Desktop
  • 81.
    Neo4j Browser 101 • Accessed through the Desktop or Web Browser (localhost:7474) • Start the browser • Enter queries / commands at the $ prompt
  • 82.
    Neo4j Browser 101 Display Options • Change node colors • Change which node property is displayed • Double-click a node and see what happens! Query Editing • Use :clear to clear past results • Use (CMD) ⌘ + Arrow / (CTRL) ^ + Arrow to scroll through past queries • Other useful commands: :history, :clear, :help • Run queries with (CMD) ⌘ + Enter / (CTRL) ^ + Enter • Insert new line with SHIFT + Enter • Expand the query bar with ESC https://neo4j.com/developer/guide-neo4j-browser/
  • 83.
    Where to get syntax help in Browser
  • 84.
    Set browser for multi-statement Click settings (gear) in lower left pane Select “Enable multi-statement query editor” • Many customers keep constraints, etc. in scripts with ;’s – this option allows you to execute the scripts without error • One note of caution: if you actually issue multiple statements, you see only the completion state of each statement, not the results as normal. Note: “Connect Result Nodes” • This is extremely useful, BUT it comes at a cost: after a query executes, Browser issues a plethora of queries to find all the connections between the returned nodes, even those not mentioned in the query. ✔ Between this and rendering the graph with auto-layout, this is why Neo4j Browser can seem to take 3 minutes to return a query that supposedly runs in 50ms • For long-running complex statements whose results already contain the desired connections, de-select this option if you don’t care to see the others. • Keep selected for this class
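    A script of the kind described here might look like the sketch below. The constraint names are hypothetical, and the FOR ... REQUIRE form assumes Neo4j 4.4 or later (earlier 4.x releases use ON (p:Person) ASSERT p.name IS UNIQUE instead).

    ```cypher
    // Run with "Enable multi-statement query editor" switched on.
    // Constraint names are hypothetical; syntax assumes Neo4j 4.4+.
    CREATE CONSTRAINT person_name_unique IF NOT EXISTS
    FOR (p:Person) REQUIRE p.name IS UNIQUE;

    CREATE CONSTRAINT movie_title_unique IF NOT EXISTS
    FOR (m:Movie) REQUIRE m.title IS UNIQUE;

    // Only the completion state of each statement is shown, not this count.
    MATCH (p:Person) RETURN count(p) AS people;
    ```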
  • 85.
    Neo4j Desktop • Create local databases • Manage multiple projects • Manage Database Server • Start Neo4j Browser instances • Install plugins (libraries) for use with a project • OS X, Linux, Windows
  • 86.
  • 87.
    In This Module You’ll Learn ... How to write Cypher statements to ... ● Retrieve nodes from the graph ● Filter nodes retrieved using labels and node property values ● Retrieve node property values ● Filter retrieved nodes using relationships Additional information is available from these sources: ● Neo4j Cypher Manual (https://neo4j.com/docs/) ● Cypher Reference card (https://neo4j.com/docs/cypher-refcard/current/)
  • 88.
    The Cypher Query Language A pattern matching query language made for graphs • Declarative & Expressive (what to find, not how to find it) • Pattern Matching Express Graph Patterns with ASCII ART ¯_(ツ)_/¯ (:Person {name:'Dan'})-[:LOVES]->(:Person {name:'Ann'})-[:OWNS]->(:Car {brand:'Volvo'}) (anything)-[:DRIVES]->(something)
  • 89.
    Cypher Query Language – the MATCH clause MATCH (n:Person {name:"Dan"})-[r:LOVES]->(x) RETURN n, r, x • MATCH finds the pattern, RETURN returns data • ( ) is a NODE • :Person is a LABEL • {name:"Dan"} is a PROPERTY • -[:LOVES]-> is a RELATIONSHIP • n, r, x are variables Example match: Dan LOVES Ann
  • 90.
  • 91.
  • 92.
    Comments in Cypher () // anonymous node, not referenced later in the query (p) // variable p, a reference to a node used later (:Person) // anonymous node with the label Person (p:Person) // p, a reference to a node with the label Person (p:Actor:Director) // p, a reference to a node with the labels Actor and Director
  • 93.
  • 94.
    Viewing the Data Model CALL db.schema.visualization
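    Alongside the schema visualization, a few built-in procedures give a plain-text view of the same model. This is a small sketch that assumes a Neo4j 4.x or later database. Run the statements one at a time, or enable the multi-statement editor.

    ```cypher
    // List every node label currently in use
    CALL db.labels();
    // List every relationship type currently in use
    CALL db.relationshipTypes();
    // List every property key currently in use
    CALL db.propertyKeys();
    ```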
  • 95.
    MATCH and RETURN Syntax examples for a query: MATCH (variable) RETURN variable MATCH (variable:Label) RETURN variable Retrieve all nodes: MATCH (n) // returns all nodes in the graph RETURN n
  • 96.
    Retrieve all Person Nodes MATCH (p:Person) // returns all Person nodes in the graph RETURN p
  • 97.
    Viewing Nodes as Table Data
  • 98.
    Exercise 1: Retrieving Nodes In the query edit pane of Neo4j Browser, execute the browser command :play 4.0-intro-neo4j-exercises and follow the instructions for Exercise 1. Note: This exercise has 4 steps. Estimated time to complete: 10 minutes
  • 99.
  • 100.
    Filtering Query by Year Born MATCH (p:Person {born: 1970}) RETURN p ({born: 1970} is the property filter)
  • 101.
    Filtering Query by Multiple Properties MATCH (m:Movie {released: 2003, tagline: 'Free your mind'}) RETURN m (multiple property filters)
  • 102.
    Returning Property Values MATCH (p:Person {born: 1965}) RETURN p.name, p.born (p.name and p.born are property values)
  • 103.
    Specifying Aliases for Column Headings MATCH (p:Person {born: 1965}) RETURN p.name AS name, p.born AS `birth year` (name and birth year become the column headings)
  • 104.
    Exercise 2: Filtering Queries Using Property Values In the query edit pane of Neo4j Browser, execute the browser command :play 4.0-intro-neo4j-exercises and follow the instructions for Exercise 2. Note: This exercise has 6 steps. Estimated time to complete: 15 minutes
  • 105.
  • 106.
    Relationships ● Directed connection between two nodes ● Relationships have a type (name) ● Relationships can have properties, just like nodes ● Relationships are key to traversing a graph
  • 107.
    Anonymous nodes/relationships Named node/relationship • Node label or relationship type is specified Anonymous node/relationship • Node/relationships are not specified – “empty” placeholders in Cypher () // a node...any node ()--() // 2 nodes have some type of relationship (any direction) ()-->() // the first node has a relationship to the second node ()-[]->() // same as above ()<--() // the second node has a relationship to the first node ()<-[]-() // ditto
  • 108.
    Querying using relationships Example graph: Person, Location, and Residence nodes with MARRIED, LIVES_AT, and OWNS relationships MATCH (p:Person)-[:LIVES_AT]->(h:Residence) RETURN p.name, h.address MATCH (p:Person)--(h:Residence) // any relationship RETURN p.name, h.address When using a “named” relationship, Neo4j can quickly traverse only those relationships and test whether the node at the other end has the correct label. When using an “anonymous” relationship, Neo4j has to traverse every relationship and then inspect every node to see whether it has the desired label. This may be slower, but it increases flexibility.
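    One way to see the difference for yourself is to prefix each query with PROFILE and compare the reported db hits. The sketch below assumes a graph that actually contains the Person, Residence, and LIVES_AT elements from the slide; on the Movies sample graph it runs but returns no rows.

    ```cypher
    // Typed relationship: only LIVES_AT relationships are expanded
    PROFILE
    MATCH (p:Person)-[:LIVES_AT]->(h:Residence)
    RETURN p.name, h.address;

    // Anonymous relationship: every relationship on each node is expanded
    PROFILE
    MATCH (p:Person)--(h:Residence)
    RETURN p.name, h.address;
    ```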
  • 109.
    Using a Relationship in a Query Find all people who acted in the movie ‘The Matrix’, and return the nodes and relationships MATCH (p:Person)-[rel:ACTED_IN]->(m:Movie {title: 'The Matrix'}) RETURN p, rel, m
  • 110.
    Querying Using Multiple Relationships Find all movies that Tom Hanks acted in or directed and return the titles of the movies MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN|DIRECTED]->(m:Movie) RETURN p.name, m.title
  • 111.
    Using Anonymous Nodes in a Query Find all people who acted in the movie ‘The Matrix’ and return their names MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'}) RETURN p.name No node variable is specified for the Movie node
  • 112.
    Using an Anonymous Relationship for a Query Find all people who have any type of relationship to the movie ‘The Matrix’, and return the Person and Movie nodes MATCH (p:Person)-->(m:Movie {title: 'The Matrix'}) RETURN p, m
  • 113.
    More Anonymous Relationships MATCH (p:Person)--(m:Movie {title: 'The Matrix'}) RETURN p, m MATCH (m:Movie)<--(p:Person {name: 'Keanu Reeves'}) RETURN p, m MATCH (p:Person)-[]-(m:Movie {title: 'The Matrix'}) RETURN p, m It is recommended that empty brackets [ ] not be used
  • 114.
    Retrieving the Relationship Types There is a built-in function, type(), that returns the type of a relationship MATCH (p:Person)-[rel]->(:Movie {title:'The Matrix'}) RETURN p.name, type(rel)
  • 115.
  • 116.
    Filtering Using Relationship Properties Find all people that gave the movie ‘The Da Vinci Code’ a rating of 65 and return their names. MATCH (p:Person)-[:REVIEWED {rating: 65}]->(:Movie {title: 'The Da Vinci Code'}) RETURN p.name ({rating: 65} is the property filter)
  • 117.
  • 118.
    Patterns in the Graph
  • 119.
    Using Patterns for Queries Looking for people that follow Angela MATCH (p:Person)-[:FOLLOWS]->(:Person {name:'Angela Scope'}) RETURN p
  • 120.
    Reversing the Traversal Looking for people that Angela follows MATCH (p:Person)<-[:FOLLOWS]-(:Person {name:'Angela Scope'}) RETURN p
  • 121.
    Querying a Relationship in Both Directions MATCH (p1:Person)-[:FOLLOWS]-(p2:Person {name:'Angela Scope'}) RETURN p1, p2
  • 122.
    Traversing Multiple Relationships Query to return all followers of the followers of Jessica Thompson MATCH (p:Person)-[:FOLLOWS]->(:Person)-[:FOLLOWS]->(:Person {name:'Jessica Thompson'}) RETURN p
  • 123.
    Using Patterns to Focus the Query MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person) RETURN a.name, m.title, d.name
  • 124.
    Returning Paths MATCH path = (:Person)-[:FOLLOWS]->(:Person)-[:FOLLOWS]->(:Person {name:'Jessica Thompson'}) RETURN path (the path is assigned to the variable path)
  • 125.
    Returning Multiple Paths MATCH path = (:Person)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(:Person {name:'Ron Howard'}) RETURN path Best practice ● Specify direction in MATCH statements ● It optimizes queries, especially for larger graphs
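    Once a path is bound to a variable, the built-in nodes(), relationships(), and length() functions can unpack it. The sketch below reuses the Ron Howard query; the column aliases are made up for illustration.

    ```cypher
    MATCH path = (:Person)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(:Person {name: 'Ron Howard'})
    // Person nodes have a name, Movie nodes have a title
    RETURN [n IN nodes(path) | coalesce(n.name, n.title)] AS stops,
           [r IN relationships(path) | type(r)] AS relTypes,
           length(path) AS hops
    ```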
  • 126.
    Cypher Style Recommendations (1 of 2) Here are the Neo4j-recommended Cypher coding standards: ● Node labels are PascalCase and case-sensitive (examples: Person, NetworkAddress). ● Property keys, variables, parameters, aliases, and functions are camelCase and case-sensitive (examples: businessAddress, title). ● Relationship types are in upper-case and can use the underscore (examples: ACTED_IN, FOLLOWS). ● Cypher keywords are upper-case (examples: MATCH, RETURN).
  • 127.
    Cypher Style Recommendations (2 of 2) Here are more Neo4j-recommended Cypher coding standards: ● String constants are in single quotes. ● Specify variables only when needed for use later in the Cypher statement. ● Place named nodes and relationships (that use variables) before anonymous nodes and relationships in your MATCH clauses when possible. ● Specify anonymous relationships with -->, --, or <--
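    As an illustration, the query below tries to follow all of these recommendations at once. It is a sketch against the Movies sample graph; the aliases actorName and movieTitle are made up for the example.

    ```cypher
    // Keywords upper-case, labels PascalCase, relationship types upper-case,
    // variables and aliases camelCase, string constants in single quotes.
    // Named nodes (actor, m) come first; the director node needs no variable.
    MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(:Person {name: 'Ron Howard'})
    RETURN actor.name AS actorName, m.title AS movieTitle
    ```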
  • 128.
    Exercise 3: Filtering Queries Using Relationships In the query edit pane of Neo4j Browser, execute the browser command :play 4.0-intro-neo4j-exercises and follow the instructions for Exercise 3. Note: This exercise has 6 steps. Estimated time to complete: 15 minutes
  • 129.
  • 130.
    Question 1 Suppose you have a graph that contains nodes representing customers and other business entities for your application. The node label in the database for a customer is Customer. Each Customer node has a property named email that contains the customer’s email address. What Cypher query do you execute to return the email addresses for all customers in the graph? Select the correct answer: ❏ MATCH (n) RETURN n.Customer.email ❏ MATCH (c:Customer) RETURN c.email ❏ MATCH (Customer) RETURN email ❏ MATCH (c) RETURN Customer.email
  • 131.
    Answer 1 What Cypher query do you execute to return the email addresses for all customers in the graph? ❏ MATCH (n) RETURN n.Customer.email ✅ MATCH (c:Customer) RETURN c.email ❏ MATCH (Customer) RETURN email ❏ MATCH (c) RETURN Customer.email
  • 132.
    Question 2 Suppose you have a graph that contains Customer and Product nodes. A Customer node can have a BOUGHT relationship with a Product node. Customer nodes can have other relationships with Product nodes. A Customer node has a property named customerName. A Product node has a property named productName. What Cypher query do you execute to return all of the products (by name) bought by customer 'ABCCO'? Select the correct answer: ❏ MATCH (c:Customer {customerName: 'ABCCO'}) RETURN c.BOUGHT.productName ❏ MATCH (:Customer 'ABCCO')-[:BOUGHT]->(p:Product) RETURN p.productName ❏ MATCH (p:Product)<-[:BOUGHT_BY]-(:Customer 'ABCCO') RETURN p.productName ❏ MATCH (:Customer {customerName: 'ABCCO'})-[:BOUGHT]->(p:Product) RETURN p.productName
  • 133.
    Answer 2 What Cypher query do you execute to return all of the products (by name) bought by customer 'ABCCO'? ❏ MATCH (c:Customer {customerName: 'ABCCO'}) RETURN c.BOUGHT.productName ❏ MATCH (:Customer 'ABCCO')-[:BOUGHT]->(p:Product) RETURN p.productName ❏ MATCH (p:Product)<-[:BOUGHT_BY]-(:Customer 'ABCCO') RETURN p.productName ✅ MATCH (:Customer {customerName: 'ABCCO'})-[:BOUGHT]->(p:Product) RETURN p.productName
  • 134.
    Question 3 When must you use a variable in a MATCH clause? Select the correct answer: ❏ When you want to query the graph using a node label ❏ When you specify a property value to match the query ❏ When you want to use the node or relationship to return a value ❏ When the query involves 2 types of nodes
  • 135.
    Answer 3 When must you use a variable in a MATCH clause? ❏ When you want to query the graph using a node label ❏ When you specify a property value to match the query ✅ When you want to use the node or relationship to return a value ❏ When the query involves 2 types of nodes
  • 136.
    Summary You should now be able to write Cypher statements to: ● Retrieve nodes from the graph ● Filter nodes retrieved using labels and node property values ● Retrieve node property values ● Filter retrieved nodes using relationships
  • 137.
    Using WHERE to Filter Queries
  • 138.
    In This Module You’ll Learn ... How to write Cypher WHERE clauses for testing: ● Equality ● Multiple values ● Ranges ● Labels ● Existence of a property ● String values ● Regular expressions ● Patterns in the graph ● Inclusion in a list
  • 139.
  • 140.
    Cypher WHERE Not using WHERE: MATCH (p:Person)-[:ACTED_IN]->(m:Movie {released: 2008}) RETURN p, m Using the WHERE clause: MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE m.released = 2008 RETURN p, m
  • 141.
    Querying Multiple Values MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE m.released = 2008 OR m.released = 2009 RETURN p, m In a WHERE clause, each value being tested needs its own comparison that names the variable and property (m.released = 2008 OR m.released = 2009)
  • 142.
  • 143.
    Querying Ranges Find all people who acted in movies released between 2003 and 2004 MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE m.released >= 2003 AND m.released <= 2004 RETURN p.name, m.title, m.released MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE 2003 <= m.released <= 2004 // floor and ceiling notation RETURN p.name, m.title, m.released Supported comparison operators: =, <>, <, >, <=, >=, IS NULL, IS NOT NULL
  • 144.
  • 145.
  • 146.
    Querying Using Labels MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'}) RETURN p.name MATCH (p)-[:ACTED_IN]->(m) WHERE p:Person AND m:Movie AND m.title='The Matrix' RETURN p.name Simplification of the two queries above, showing only label Person and variable p: MATCH (p:Person) RETURN p.name MATCH (p) WHERE p:Person RETURN p.name
  • 147.
    Existence of a Property
  • 148.
    Filter on Existence of a Property MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE p.name ='Jack Nicholson' AND exists(m.tagline) RETURN m.title, m.tagline
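    The same test can also be written with IS NOT NULL, which is one of the comparison operators listed earlier in this module. The property form exists(m.tagline) was deprecated in Neo4j 4.3 and removed in Neo4j 5, so this form is the one that keeps working on newer versions. A sketch against the Movies sample graph:

    ```cypher
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = 'Jack Nicholson' AND m.tagline IS NOT NULL
    RETURN m.title, m.tagline
    ```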
  • 149.
  • 150.
    Querying using Strings Find all actors whose first name is Michael MATCH (p:Person)-[:ACTED_IN]->() WHERE p.name STARTS WITH 'Michael' RETURN p.name
  • 151.
    String Comparisons ● String comparisons are case-sensitive ● Use toLower( ) and toUpper( ) ● Indexes are not used if a property value has been converted with a function MATCH (p:Person)-[:ACTED_IN]->() WHERE toLower(p.name) STARTS WITH 'michael' RETURN p.name
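    Besides STARTS WITH, Cypher also has the ENDS WITH and CONTAINS string operators, which are case-sensitive in the same way. A small sketch against the Movies sample graph; the search strings are arbitrary examples.

    ```cypher
    MATCH (p:Person)
    WHERE p.name ENDS WITH 'Hanks'      // suffix match
       OR p.name CONTAINS 'ean'         // substring match, case-sensitive
    RETURN p.name
    ```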
  • 152.
  • 153.
    Querying with Regular Expressions ● Indexes are never used for regular expressions ● The property value must fully match the regular expression MATCH (p:Person) WHERE p.name =~ 'Tom.*' RETURN p.name
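    Because Cypher regular expressions follow Java regex syntax, an inline (?i) flag can make the match case-insensitive without calling toLower(). A minimal sketch:

    ```cypher
    MATCH (p:Person)
    // (?i) switches on case-insensitive matching; the value must still match in full
    WHERE p.name =~ '(?i)tom.*'
    RETURN p.name
    ```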
  • 154.
  • 155.
    Patterns (1 of 3) Return all Person nodes of people who wrote movies MATCH (p:Person)-[:WROTE]->(m:Movie) RETURN p.name, m.title
  • 156.
    Patterns (2 of 3) The query is modified to exclude people who directed that particular movie MATCH (p:Person)-[:WROTE]->(m:Movie) WHERE NOT exists( (p)-[:DIRECTED]->(m) ) RETURN p.name, m.title
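    Neo4j 4.0 and later also offer an existential subquery, EXISTS { ... }, as another way to express the same filter. The sketch below should return the same rows as the query on this slide.

    ```cypher
    MATCH (p:Person)-[:WROTE]->(m:Movie)
    // Keep only writers who did not also direct the same movie
    WHERE NOT EXISTS { MATCH (p)-[:DIRECTED]->(m) }
    RETURN p.name, m.title
    ```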
  • 157.
    Patterns (3 of 3) Find Gene Hackman and ... ● The movies that he ACTED_IN with another person who also DIRECTED the movie MATCH (gene:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(other:Person) WHERE gene.name = 'Gene Hackman' AND exists( (other)-[:DIRECTED]->(m) ) RETURN gene, other, m
  • 158.
  • 159.
    List Values Retrieve Person nodes of people born in 1965 or 1970 MATCH (p:Person) WHERE p.born IN [1965, 1970] RETURN p.name as name, p.born as yearBorn
  • 160.
    List Values in the Graph MATCH (p:Person)-[r:ACTED_IN]->(m:Movie) WHERE 'Neo' IN r.roles AND m.title='The Matrix' RETURN p.name Later in this course, you will learn how to create lists from your queries by aggregating data in the graph. There are a number of syntax elements of Cypher that we have not covered in this training. For example, you can specify CASE logic in your conditional testing for your WHERE clauses. You can learn more about these syntax elements in the Neo4j Cypher Manual and the Cypher Refcard.
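    To give a flavor of the CASE logic mentioned here, the sketch below uses a CASE expression in the RETURN clause, which is the simplest place to try it. The cohort alias and the cut-off year are made up for illustration.

    ```cypher
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE m.title = 'The Matrix'
    RETURN p.name,
           CASE
             WHEN p.born < 1965 THEN 'born before 1965'
             WHEN p.born IS NULL THEN 'unknown'
             ELSE 'born 1965 or later'
           END AS cohort
    ```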
  • 161.
    Exercise 4: Filtering Queries Using the WHERE Clause In the query edit pane of Neo4j Browser, execute the browser command :play 4.0-intro-neo4j-exercises and follow the instructions for Exercise 4. Note: This exercise has 6 steps. Estimated time to complete: 30 minutes
  • 162.
  • 163.
    Question 1 Suppose you want to add a WHERE clause at the end of this statement to filter the results retrieved. MATCH (p:Person)-[rel]->(m:Movie)<-[:PRODUCED]-(:Person) What variables can you test in the WHERE clause? Select the correct answers. ❏ p ❏ rel ❏ m ❏ PRODUCED
  • 164.
    Answer 1 MATCH (p:Person)-[rel]->(m:Movie)<-[:PRODUCED]-(:Person) What variables can you test in the WHERE clause? ✅ p ✅ rel ✅ m ❏ PRODUCED
  • 165.
    Question 2 Suppose you want to retrieve all movies that have a released property value that is 2000, 2002, 2004, 2006, or 2008. Here is an incomplete Cypher example to return the title property values of all movies released in these years. What keyword do you specify for XX? MATCH (m:Movie) WHERE m.released XX [2000, 2002, 2004, 2006, 2008] RETURN m.title Select the correct answer: ❏ CONTAINS ❏ IN ❏ IS ❏ EQUALS
  • 166.
    Answer 2 What keyword do you specify for XX? MATCH (m:Movie) WHERE m.released XX [2000, 2002, 2004, 2006, 2008] RETURN m.title ❏ CONTAINS ✅ IN ❏ IS ❏ EQUALS
  • 167.
    Question 3 We want a query that returns the names of any people who both acted in and wrote the same movie. What query will retrieve this data? Select the correct answer. ❏ MATCH (p:Person) WHERE (p)-[:WROTE]-(m) AND (p)-[WROTE]-(m) RETURN p.name, m.title ❏ MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE (p)-[:WROTE]-(m) RETURN p.name, m.title ❏ MATCH (p:Person)-[:ACTED_IN | WROTE]->(m:Movie) RETURN p.name, m.title ❏ MATCH (p:Person)-[:ACTED_IN]->(m:Movie)<-[WROTE]-(p) RETURN p.name, m.title
  • 168.
    Answer 3 We want a query that returns the names of any people who both acted in and wrote the same movie. What query will retrieve this data? ❏ MATCH (p:Person) WHERE (p)-[:WROTE]-(m) AND (p)-[WROTE]-(m) RETURN p.name, m.title ✅ MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE (p)-[:WROTE]-(m) RETURN p.name, m.title ❏ MATCH (p:Person)-[:ACTED_IN | WROTE]->(m:Movie) RETURN p.name, m.title ❏ MATCH (p:Person)-[:ACTED_IN]->(m:Movie)<-[WROTE]-(p) RETURN p.name, m.title
  • 169.
    Summary You should now be able to write Cypher WHERE clauses to test: ● Equality ● Multiple values ● Ranges ● Labels ● Existence of a property ● String values ● Regular expressions ● Patterns in the graph ● Inclusion in a list
  • 170.
  • 171.
    In This Module You’ll Learn ... How to write Cypher statements to ... ● Specify multiple MATCH clauses ● Specify multiple MATCH patterns ● Specify varying length paths ● Return a subgraph ● Specify OPTIONAL in a query
  • 172.
  • 173.
    Traversal in a MATCH Clause Find all of the followers of people who reviewed the movie titled The Replacements MATCH (follower:Person)-[:FOLLOWS]->(reviewer:Person)-[:REVIEWED]->(m:Movie) WHERE m.title = 'The Replacements' RETURN follower.name, reviewer.name
  • 174.
  • 175.
    Multiple Patterns in a MATCH MATCH (a:Person)-[:ACTED_IN]->(m:Movie), (m)<-[:DIRECTED]-(d:Person) WHERE m.released = 2000 RETURN a.name, m.title, d.name
  • 176.
    A Single Pattern in a MATCH Another way to write this same query ... MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person) WHERE m.released = 2000 RETURN a.name, m.title, d.name
  • 177.
    Required Two Patterns in a MATCH MATCH (meg:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person), (other:Person)-[:ACTED_IN]->(m) WHERE meg.name = 'Meg Ryan' RETURN m.title AS movie, d.name AS director, other.name AS `co-actors`
  • 178.
    Two Patterns in a MATCH MATCH (keanu:Person)-[:ACTED_IN]->(movie:Movie)<-[:ACTED_IN]-(n:Person), (hugo:Person) WHERE keanu.name='Keanu Reeves' AND hugo.name='Hugo Weaving' AND NOT (hugo)-[:ACTED_IN]->(movie) RETURN n.name
  • 179.
    Traversal With Patterns MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie) MATCH (actor:Person)-[:ACTED_IN]->(m) WHERE valKilmer.name = 'Val Kilmer' RETURN m.title AS movie, actor.name
  • 180.
    Traversal Multiple Patterns MATCH (valKilmer:Person)-[:ACTED_IN]->(m:Movie), (actor:Person)-[:ACTED_IN]->(m) WHERE valKilmer.name = 'Val Kilmer' RETURN m.title AS movie, actor.name
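    The two Val Kilmer queries are not quite interchangeable. Within a single MATCH clause, Cypher will not traverse the same relationship twice, so the comma-separated form leaves Val Kilmer out of the actor column. With two separate MATCH clauses, that rule applies to each clause on its own, so he also shows up as his own co-actor. If the two-clause form is preferred, one way to get the same result is an explicit filter, sketched here (the coActor alias is made up):

    ```cypher
    MATCH (valKilmer:Person {name: 'Val Kilmer'})-[:ACTED_IN]->(m:Movie)
    MATCH (actor:Person)-[:ACTED_IN]->(m)
    WHERE actor <> valKilmer   // drop Val Kilmer from his own co-actor list
    RETURN m.title AS movie, actor.name AS coActor
    ```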
  • 181.
  • 182.
    Varying Length Paths MATCH (follower:Person)-[:FOLLOWS*2]->(p:Person) WHERE follower.name = 'Paul Blythe' RETURN p.name
  • 183.
    Varying Length Patterns (1 of 2) (nodeA)-[:RELTYPE*]->(nodeB) Retrieve all paths of any length with the relationship :RELTYPE from nodeA to nodeB and beyond (nodeA)-[:RELTYPE*]-(nodeB) Direction removed: retrieve all paths of any length with the relationship :RELTYPE from nodeA to nodeB or from nodeB to nodeA and beyond. Usually this is a very expensive query, so limit the retrieved nodes
  • 184.
    Varying Length Patterns (2 of 2) (node1)-[:RELTYPE*3]->(node2) Retrieve the paths of length 3 with the relationship :RELTYPE from node1 to node2 (node1)-[:RELTYPE*1..3]->(node2) Retrieve the paths of lengths 1, 2, or 3 with the relationship :RELTYPE from node1 to node2 (up to 3 hops)
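    Putting the bounded form to work on the Movies sample graph might look like the sketch below. The 1..3 range and the LIMIT of 25 are arbitrary choices for the example; DISTINCT is there because the same person can be reached along more than one path.

    ```cypher
    MATCH (follower:Person {name: 'Paul Blythe'})-[:FOLLOWS*1..3]->(p:Person)
    RETURN DISTINCT p.name
    LIMIT 25
    ```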
  • 185.
    Finding the Shortest Path MATCH p = shortestPath((m1:Movie)-[*]-(m2:Movie)) WHERE m1.title = 'A Few Good Men' AND m2.title = 'The Matrix' RETURN p
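    Two common variations are to cap the path length and to ask for every path that ties for shortest, using allShortestPaths(). The upper bound of 10 in this sketch is an arbitrary assumption.

    ```cypher
    MATCH (m1:Movie {title: 'A Few Good Men'}), (m2:Movie {title: 'The Matrix'})
    // Return all equally short paths, searching at most 10 hops
    MATCH p = allShortestPaths((m1)-[*..10]-(m2))
    RETURN p
    ```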
  • 186.
  • 187.
    Returning a Subgraph MATCHpaths = (m:Movie)-[rel]-(p:Person) WHERE m.title = 'The Replacements' RETURN paths 187
  • 188.
  • 189.
    Specifying Optional Pattern Matching
    MATCH (p:Person)
    WHERE p.name STARTS WITH 'James'
    OPTIONAL MATCH (p)-[r:REVIEWED]->(m:Movie)
    RETURN p.name, type(r), m.title
    Subgraph of the movies graph with all people named James and their relationships
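When the optional pattern finds nothing, its variables are null in the returned row. A small sketch of replacing those nulls with a readable default; the 'no reviews' text and the `reviewed` alias are illustrative choices, not part of the course material.

```cypher
MATCH (p:Person)
WHERE p.name STARTS WITH 'James'
OPTIONAL MATCH (p)-[r:REVIEWED]->(m:Movie)
// m is null for people with no REVIEWED relationship
RETURN p.name, coalesce(m.title, 'no reviews') AS reviewed
```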
  • 190.
    Exercise 5: Working with Patterns in Queries
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 5.
    Note: This exercise has 6 steps. Estimated time to complete: 30 minutes
  • 191.
  • 192.
    Question 1
    Given this Cypher query:
    MATCH (follower:Person)-[:FOLLOWS]->(reviewer:Person)-[:REVIEWED]->(m:Movie)
    WHERE m.title = 'The Replacements'
    RETURN follower.name, reviewer.name
    What is the first node that is retrieved by the query engine?
    Select the correct answer:
    ❏ The first Person node with a FOLLOWS relationship
    ❏ The first Person node with a REVIEWED relationship
    ❏ The Movie node for the movie, The Replacements
    ❏ The first Movie node in the alphabetical list of movies in the graph
  • 193.
  • 194.
    Question 2
    We want a query that returns a list of people who acted in movies released later than 2005 and, for those movies, also returns the title and released year of the movie, as well as the name of the writer. How can you correct this query?
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    (m)<-[:WROTE]-(w:Person)
    WHERE m.released > 2005
    RETURN a.name, m.title, m.released, w.name
    Select the correct answer:
    ❏ The second line should be: (m2:Movie)<-[:WROTE]-(w:Person).
    ❏ Add a comma after the first pattern in the MATCH clause.
    ❏ The second line should be: (m2:Movie)<-[:WROTE]-(a).
    ❏ Add a MATCH clause at the beginning of the second line.
  • 195.
  • 196.
    Question 3
    Suppose you have a graph of Person nodes representing a social network. A Person node can have an IS_FRIENDS_WITH relationship with any other Person node. Like in Facebook, there can be a long path of connections between people. What Cypher MATCH clause would you use to find all people in this graph that are two to four hops away from each other?
    Select the correct answer:
    ❏ MATCH (p:Person)-[:IS_FRIENDS_WITH*2..4]->(p2:Person)
    ❏ MATCH (p:Person)-[:IS_FRIENDS_WITH*2-4]->(p2:Person)
    ❏ MATCH (p:Person)-[:IS_FRIENDS_WITH,2-4]->(p2:Person)
    ❏ MATCH (p:Person)-[:IS_FRIENDS_WITH,2,4]->(p2:Person)
  • 197.
  • 198.
    Summary
    You should now be able to write Cypher statements to:
    ● Specify multiple MATCH clauses
    ● Specify multiple MATCH patterns
    ● Specify varying length paths
    ● Return a subgraph
    ● Specify OPTIONAL MATCH in a query
  • 199.
  • 200.
    In This Module You’ll Learn ...
    How to write Cypher statements to:
    ● Aggregate data into lists
    ● Work with lists
    ● Count results returned
    ● Work with maps
    ● Work with dates
  • 201.
  • 202.
    Automatic Grouping in Cypher
    MATCH (p:Person)-[:REVIEWED]->(m:Movie)
    RETURN p.name, m.title
    By default, Cypher automatically returns values grouped by a common value (in the result shown on the slide, the movie titles).
  • 203.
    Aggregation Using collect()
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Cruise'
    RETURN collect(m.title) AS `movies for Tom Cruise`
  • 204.
    Collecting Nodes
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Cruise'
    RETURN collect(m) AS `movies for Tom Cruise`
    ● Returned as a graph
    ● The same as simply returning m
    ● Result viewed as a table
    ● Each node is an object in the list
  • 205.
    Aggregation Using count()
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
    RETURN a.name, d.name, count(m)
  • 206.
    Counting and Collecting
    MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(director:Person)
    RETURN actor.name, director.name, count(m) AS collaborations, collect(m.title) AS movies
  • 207.
    Using collect() and size()
    Using size() is an alternative to using count():
    ● size() returns the number of elements in a list
    ● count() returns the count for a set
    This query returns the same result:
    MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(director:Person)
    RETURN actor.name, director.name, size(collect(m)) AS collaborations, collect(m.title) AS movies
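Both count() and size(collect()) count every matched row, so a movie reached by more than one relationship is counted more than once. The sketch below, assuming the Movie graph and using illustrative alias names, shows how DISTINCT changes either form.

```cypher
// Tom Hanks both acted in and directed at least one movie in the Movie graph,
// so the same Movie node can appear in more than one row.
MATCH (p:Person)-[:ACTED_IN|DIRECTED]->(m:Movie)
WHERE p.name = 'Tom Hanks'
RETURN count(m) AS totalRows,                       // one per matched row
       count(DISTINCT m) AS moviesByCount,          // each movie once
       size(collect(DISTINCT m)) AS moviesBySize    // same value via a list
```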
  • 208.
    Working With Cypher Data
    Movie nodes have 3 properties:
    ● 2 of type String
    ● 1 of type Integer
  • 209.
  • 210.
    Lists
    Return the cast list for every movie and the size of the cast:
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    RETURN m.title, collect(a) AS cast, size(collect(a)) AS castSize
  • 211.
    Using Strings in Lists
    Modifying the query slightly ...
    ● The list contains the names, instead of the entire set of Person node properties
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    RETURN m.title, collect(a.name) AS cast, size(collect(a.name)) AS castSize
  • 212.
    Accessing Elements of the List
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    RETURN m.title, collect(a.name)[0] AS `A cast member`, size(collect(a.name)) AS castSize
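Beyond a single index, Cypher lists also accept negative indexes and ranges. A brief sketch under the same Movie graph assumption; the alias names are made up for the example.

```cypher
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH m, collect(a.name) AS cast
RETURN m.title,
       cast[0] AS firstMember,      // first element
       cast[-1] AS lastMember,      // negative indexes count from the end
       cast[0..2] AS firstTwo,      // range: start inclusive, end exclusive
       size(cast) AS castSize
```

Collecting once in a WITH clause avoids repeating collect(a.name) for every column.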
  • 213.
  • 214.
    Working With Maps
    RETURN {Jan: 31, Feb: 28, Mar: 31, Apr: 30, May: 31, Jun: 30, Jul: 31, Aug: 31, Sep: 30, Oct: 31, Nov: 30, Dec: 31}['Feb'] AS DaysInFeb
  • 215.
    Accessing Map Elements
    A map is returned ...
    ● when a returned node is displayed using Table in Neo4j Browser
    The returned Movie nodes are displayed here as a map
  • 216.
    Map Projections
    MATCH (m:Movie)
    WHERE m.title CONTAINS 'Matrix'
    RETURN m { .title, .released } AS movie
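A map projection can also carry keys that are not node properties. As a sketch, the query below adds a `cast` key (a name chosen here for illustration) built from an aggregation done in a preceding WITH clause.

```cypher
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.title CONTAINS 'Matrix'
WITH m, collect(a.name) AS cast
// .title and .released are copied from the node; cast is a computed entry
RETURN m { .title, .released, cast: cast } AS movie
```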
  • 217.
  • 218.
    Working With Dates
    RETURN date(), datetime(), time(), timestamp()
  • 219.
    Accessing Components of Dates
    RETURN date().day, date().year, datetime().year, datetime().hour, datetime().minute
    These functions work with strings
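As a small illustration of building a date from a string, the sketch below parses an ISO-formatted literal, reads its components, and adds a duration. The date value and the `thirtyDaysLater` alias are arbitrary choices for the example.

```cypher
// date() accepts an ISO 8601 string such as 'YYYY-MM-DD'
WITH date('2022-05-17') AS d
RETURN d.year, d.month, d.day,
       d + duration({days: 30}) AS thirtyDaysLater
```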
  • 220.
  • 221.
    Type and Data Conversions
    Here are some of the built-in conversion functions:
    ● toInteger()
    ● toLower()
    ● toUpper()
    ● toString()
    Consult the Neo4j Cypher Manual for more information
    ● It includes much more on the built-in functions that are available
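A quick way to see what these functions do is to call them on literals. The values and alias names below are illustrative only; toFloat() is included because it appears later in the import material.

```cypher
RETURN toInteger('2005') AS releaseYear,   // string to integer
       toFloat('7.5') AS avgVote,          // string to float
       toString(2005) AS yearText,         // integer to string
       toUpper('matrix') AS upperTitle,
       toLower('MATRIX') AS lowerTitle,
       toInteger('n/a') AS unparsable      // null when the string is not a number
```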
  • 222.
    Exercise 6: Working with Cypher Data
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 6.
    Note: This exercise has 6 steps. Estimated time to complete: 15 minutes
  • 223.
  • 224.
    Question 1
    Which of the functions below aggregate results?
    Select the correct answers:
    ❏ count()
    ❏ size()
    ❏ map()
    ❏ collect()
  • 225.
  • 226.
    Question 2
    What construct best represents a node in the graph?
    Select the correct answer:
    ❏ list
    ❏ map
    ❏ collection
    ❏ blob
  • 227.
  • 228.
    Question 3
    Which date/time related function returns a long integer value?
    Select the correct answer:
    ❏ date()
    ❏ datetime()
    ❏ time()
    ❏ timestamp()
  • 229.
  • 230.
    Summary
    You should now be able to:
    ● Aggregate data into lists
    ● Work with lists
    ● Count results returned
    ● Work with maps
    ● Work with dates
  • 231.
  • 232.
    In This Module You’ll Learn ...
    How to write Cypher statements to:
    ● Perform intermediate processing with WITH
    ● Use WITH and UNWIND for query processing
    ● Perform subqueries with WITH
    ● Perform subqueries with CALL
  • 233.
  • 234.
    Intermediate Processing Using WITH
    Return each actor ...
    ● the number of movies they acted in
    ● and the titles of the movies
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    RETURN a.name, count(a) AS numMovies, collect(m.title) AS movies
  • 235.
    Using WITH
    Existing variables must be specified in the WITH to be available for reference later in the query.
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    WITH a, count(a) AS numMovies, collect(m.title) AS movies
    WHERE 1 < numMovies < 4
    RETURN a.name, numMovies, movies
  • 236.
  • 237.
    Using WITH and UNWIND
    WITH and UNWIND are frequently used when importing data into a graph.
    MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
    WITH collect(p) AS actors, count(p) AS actorCount, m
    UNWIND actors AS actor
    RETURN m.title, actorCount, actor.name
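To illustrate the import use case, here is a minimal sketch that turns a list of maps into rows and merges one node per row. The two example movies are made-up data, and the query writes to the graph, so it is best tried on a scratch database.

```cypher
// Hypothetical rows; in practice a list like this often arrives as a query parameter.
WITH [ {title: 'Example Movie A', released: 2001},
       {title: 'Example Movie B', released: 2002} ] AS rows
UNWIND rows AS row                        // one row per list element
MERGE (m:Movie {title: row.title})        // reuse the node if it already exists
ON CREATE SET m.released = row.released
RETURN m.title, m.released
```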
  • 238.
    Subqueries with WITH
    MATCH (m:Movie)<-[rv:REVIEWED]-(r:Person)
    WITH m, rv, r
    MATCH (m)<-[:DIRECTED]-(d:Person)
    RETURN m.title, rv.rating, r.name, collect(d.name)
  • 239.
  • 240.
    Subquery
    MATCH (p:Person)
    WITH p, size((p)-[:ACTED_IN]->()) AS movies
    WHERE movies >= 5
    OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
    RETURN p.name, m.title
  • 241.
    Performing Subqueries with CALL
    Variable m in the subquery is used again in the next query.
    CALL {
      MATCH (p:Person)-[:REVIEWED]->(m:Movie)
      RETURN m
    }
    MATCH (m)
    WHERE m.released = 2000
    RETURN m.title, m.released
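One common reason to reach for CALL { ... } is to post-process the combined result of a UNION. The sketch below assumes Neo4j 4.0 or later and the Movie graph; it gathers movies Tom Hanks acted in or directed, then sorts the merged result once.

```cypher
CALL {
  MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
  WHERE p.name = 'Tom Hanks'
  RETURN m
  UNION                       // UNION also removes duplicate rows
  MATCH (p:Person)-[:DIRECTED]->(m:Movie)
  WHERE p.name = 'Tom Hanks'
  RETURN m
}
// Only m is visible here; it is the variable returned by the subquery.
RETURN m.title, m.released
ORDER BY m.released DESC
```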
  • 242.
    Exercise 7: Controlling Query Processing
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 7.
    Note: This exercise has 5 steps. Estimated time to complete: 15 minutes
  • 243.
  • 244.
    Question 1
    Given this code snippet, what variables can you use in the RETURN clause?
    MATCH (a:Person)-[r:ACTED_IN]->(m:Movie)
    WITH a, count(a) AS numMovies
    WHERE 1 < numMovies < 4
    RETURN ??
    Select the correct answers:
    ❏ a
    ❏ r
    ❏ m
    ❏ numMovies
  • 245.
  • 246.
    Question 2
    What clauses enable you to perform subqueries?
    Select the correct answers:
    ❏ SUBMATCH
    ❏ WITH
    ❏ QUERY
    ❏ CALL
  • 247.
  • 248.
    Question 3
    Given this Cypher query, what Cypher clause do you use here to turn the list of movies into rows?
    MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
    WITH collect(m) AS movies, count(m) AS movieCount, p
    ?? movies AS movie
    RETURN p.name, movieCount, movie.title
    Select the correct answer:
    ❏ ELEMENTS
    ❏ UNWIND
    ❏ ROWS
    ❏ SELECT
  • 249.
  • 250.
    Summary
    You should now be able to write Cypher statements to:
    ● Perform intermediate processing with WITH
    ● Use WITH and UNWIND for query processing
    ● Perform subqueries with WITH
    ● Perform subqueries with CALL
  • 251.
  • 252.
    In This Module You’ll Learn ...
    How to write Cypher statements to:
    ● Eliminate duplication in results
    ● Order results
    ● Limit the number of results
  • 253.
  • 254.
    Example with Duplicate Results
    MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Hanks'
    RETURN m.title, m.released
    Returned 13 records
  • 255.
    Eliminating Duplication
    Eliminate duplicates using DISTINCT:
    MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Hanks'
    RETURN DISTINCT m.title, m.released
    Returned 12 records
  • 256.
    Duplication in Lists
    MATCH (p:Person)-[:ACTED_IN | DIRECTED | WROTE]->(m:Movie)
    WHERE m.released = 2003
    RETURN m.title, collect(p.name) AS credits
    The credits lists contain duplicates.
  • 257.
    Eliminating Duplication in Lists
    MATCH (p:Person)-[:ACTED_IN | DIRECTED | WROTE]->(m:Movie)
    WHERE m.released = 2003
    RETURN m.title, collect(DISTINCT p.name) AS credits
  • 258.
    WITH and DISTINCT to Eliminate Duplication
    MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Hanks'
    WITH DISTINCT m
    RETURN m.released, m.title
  • 259.
  • 260.
    Ordering Results
    MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Hanks' OR p.name = 'Keanu Reeves'
    RETURN DISTINCT m.title, m.released
    ORDER BY m.released DESC
  • 261.
    Ordering Multiple Results
    There is no limit on how many properties you can order by in a query.
    MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
    WHERE p.name = 'Tom Hanks' OR p.name = 'Keanu Reeves'
    RETURN DISTINCT m.title, m.released
    ORDER BY m.released DESC, m.title
    Rows are sorted by release date descending, then by title.
  • 262.
  • 263.
    Limiting the Number of Results
    MATCH (m:Movie)
    RETURN m.title AS title, m.released AS year
    ORDER BY m.released DESC
    LIMIT 10
    Returned 10 records
  • 264.
    Limiting Number of Intermediate Results
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WITH m, p LIMIT 6
    RETURN collect(p.name), m.title
  • 265.
    Another Example Using LIMIT
    MATCH (m:Movie)
    WITH m LIMIT 5
    MATCH path = (m)<-[:ACTED_IN]-(:Person)
    WITH m, collect(path) AS paths
    RETURN m, paths[0..2]
    Note: This display in Neo4j Browser is with Connect result nodes unchecked
  • 266.
    Alternative to LIMIT
    An alternative to the code above:
    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    WITH a, collect(m.title) AS movies
    WHERE size(movies) = 5
    RETURN a.name, movies

    MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
    WITH a, count(*) AS numMovies, collect(m.title) AS movies
    WHERE numMovies = 5
    RETURN a.name, numMovies, movies
  • 267.
    Exercise 8: Controlling Results Returned
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 8.
    Note: This exercise has 5 steps. Estimated time to complete: 15 minutes
  • 268.
  • 269.
    Question 1
    This code returns the titles of all movies that have been reviewed. Multiple people can review a movie. How can you change this code so that a movie title will only be returned once?
    MATCH (m:Movie)<-[:REVIEWED]-()
    RETURN m.title
    Select the correct answers:
    ❏ MATCH (m:Movie)<-[:REVIEWED]-() RETURN DISTINCT m.title
    ❏ MATCH (m:Movie)<-[:REVIEWED]-() RETURN UNIQUE m.title
    ❏ MATCH (m:Movie)<-[:REVIEWED]-() WITH DISTINCT m RETURN m.title
    ❏ MATCH (m:Movie)<-[:REVIEWED]-() WITH UNIQUE m RETURN m.title
  • 270.
  • 271.
    Question 2
    How many property values can you order by in the returned result?
    Select the correct answer:
    ❏ One
    ❏ As many as needed
    ❏ Two
    ❏ Three
  • 272.
  • 273.
    Question 3
    We want to retrieve the names of the five oldest persons in our dataset. What code will do this?
    Select the correct answers:
    ❏ MATCH (p:Person)-[:ACTED_IN]->() WITH p LIMIT 5 RETURN DISTINCT p.name, p.born ORDER BY p.born
    ❏ MATCH (p:Person) WITH p LIMIT 5 RETURN DISTINCT p.name, p.born ORDER BY p.born
    ❏ MATCH (p:Person)-[:ACTED_IN]->() RETURN DISTINCT p.name, p.born ORDER BY p.born LIMIT 5
    ❏ MATCH (p:Person) RETURN DISTINCT p.name, p.born ORDER BY p.born LIMIT 5
  • 274.
  • 275.
    Summary
    You should now be able to write Cypher statements to:
    ● Eliminate duplication in results
    ● Order results
    ● Limit the number of results
  • 276.
  • 277.
    Overview
    At the end of this module, you should be able to write Cypher statements to:
    ● Create a node:
      ■ Add and remove node labels.
      ■ Add and remove node properties.
      ■ Update properties.
    ● Create a relationship:
      ■ Add and remove properties for a relationship.
    ● Delete a node.
    ● Delete a relationship.
    ● Merge data in a graph:
      ■ Create nodes.
      ■ Create relationships.
  • 278.
    Creating a node
    Create a node of type Movie with the title property set to Batman Begins:
    CREATE (:Movie {title: 'Batman Begins'})
    Create a node of type Movie and Action with the title property set to Batman Begins:
    CREATE (:Movie:Action {title: 'Batman Begins'})
    Create a node of type Movie with the title property set to Batman Begins and return the node:
    CREATE (m:Movie {title: 'Batman Begins'})
    RETURN m
    <id> is set by the graph engine
  • 279.
    Creating multiple nodes
    Create some Person nodes for actors and the director for the movie, Batman Begins:
    CREATE (:Person {name: 'Michael Caine', born: 1933}),
           (:Person {name: 'Liam Neeson', born: 1952}),
           (:Person {name: 'Katie Holmes', born: 1978}),
           (:Person {name: 'Benjamin Melniker', born: 1913})
    Important: The graph engine will create a node with the same properties as a node that already exists. You can prevent this from happening in one of two ways:
    1. You can use `MERGE` rather than `CREATE` when creating the node.
    2. You can add constraints to your graph. Then an attempt to create a “duplicate” node will result in an error.
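As a sketch of the first option, the statement below uses MERGE with only the identifying property in the pattern. It assumes that `name` is what identifies a Person in this graph; the remaining property is set only when the node is newly created.

```cypher
// Running this more than once does not add another Katie Holmes node:
// MERGE matches on the pattern {name: ...} and creates only if nothing matches.
MERGE (p:Person {name: 'Katie Holmes'})
ON CREATE SET p.born = 1978
RETURN p
```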
  • 280.
    Adding a label to a node
    Add the Action label to the movie, Batman Begins, and return all labels for this node:
    MATCH (m:Movie)
    WHERE m.title = 'Batman Begins'
    SET m:Action
    RETURN labels(m)
  • 281.
    Removing a label from a node
    Remove the Action label from the movie, Batman Begins, and return all labels for this node:
    MATCH (m:Movie:Action)
    WHERE m.title = 'Batman Begins'
    REMOVE m:Action
    RETURN labels(m)
  • 282.
    Adding or updating properties for a node
    Add the properties released, lengthInMinutes, videoFormat, and grossMillions to the movie Batman Begins:
    MATCH (m:Movie)
    WHERE m.title = 'Batman Begins'
    SET m.released = 2005,
        m.lengthInMinutes = 140,
        m.videoFormat = 'DVD',
        m.grossMillions = 206.5
    RETURN m
    ● If the property does not exist for the node, it is added with the specified value.
    ● If the property exists for the node, it is updated with the specified value.
  • 283.
    Removing properties from a node
    Properties can be removed in one of two ways:
    • Set the property value to null
    • Use the REMOVE keyword
    Remove the grossMillions and videoFormat properties:
    MATCH (m:Movie)
    WHERE m.title = 'Batman Begins'
    SET m.grossMillions = null
    REMOVE m.videoFormat
    RETURN m
  • 284.
    Exercise 9: Creating Nodes
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 9.
    Note: This exercise has 18 steps. Estimated time to complete: 40 minutes
  • 285.
    Creating a relationship
    You create a relationship by:
    1. Finding the “from node”.
    2. Finding the “to node”.
    3. Using CREATE to add the directed relationship between the nodes.
    Create the :ACTED_IN relationship between the Person, Michael Caine and the Movie, Batman Begins:
    MATCH (a:Person), (m:Movie)
    WHERE a.name = 'Michael Caine' AND m.title = 'Batman Begins'
    CREATE (a)-[:ACTED_IN]->(m)
    RETURN a, m
  • 286.
    Creating multiple relationships
    Create the :ACTED_IN relationship between the Person, Liam Neeson and the Movie, Batman Begins and the :PRODUCED relationship between the Person, Benjamin Melniker and the same movie:
    MATCH (a:Person), (m:Movie), (p:Person)
    WHERE a.name = 'Liam Neeson' AND m.title = 'Batman Begins' AND p.name = 'Benjamin Melniker'
    CREATE (a)-[:ACTED_IN]->(m)<-[:PRODUCED]-(p)
    RETURN a, m, p
  • 287.
    Adding properties to relationships
    Same technique you use for creating and updating node properties.
    Add the roles property to the :ACTED_IN relationship from Christian Bale to Batman Begins:
    MATCH (a:Person), (m:Movie)
    WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
      AND NOT exists((a)-[:ACTED_IN]->(m))
    CREATE (a)-[rel:ACTED_IN]->(m)
    SET rel.roles = ['Bruce Wayne', 'Batman']
    RETURN a, m
  • 288.
    Removing properties from relationships
    Same technique you use for removing node properties.
    Remove the roles property from the :ACTED_IN relationship from Christian Bale to Batman Begins:
    MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
    WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
    REMOVE rel.roles
    RETURN a, rel, m
  • 289.
    Exercise 10: Creating Relationships
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 10.
    Note: This exercise has 13 steps. Estimated time to complete: 35 minutes
  • 290.
    Deleting a relationship
    Batman Begins relationships:
    Delete the :ACTED_IN relationship between Christian Bale and Batman Begins:
    MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
    WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
    DELETE rel
    RETURN a, m
  • 291.
    After deleting the relationship from Christian Bale to Batman Begins
    Batman Begins relationships:
    Christian Bale relationships:
  • 292.
    Deleting a relationship and a node - 1
    Batman Begins relationships:
    Delete the :PRODUCED relationship between Benjamin Melniker and Batman Begins, as well as the Benjamin Melniker node:
    MATCH (p:Person)-[rel:PRODUCED]->(:Movie)
    WHERE p.name = 'Benjamin Melniker'
    DELETE rel, p
  • 293.
    Deleting a relationship and a node - 2
    Batman Begins relationships:
    Attempt to delete Liam Neeson and not his relationships to any other nodes:
    MATCH (p:Person)
    WHERE p.name = 'Liam Neeson'
    DELETE p
  • 294.
    Deleting a relationship and a node - 3
    Batman Begins relationships:
    Delete Liam Neeson and his relationships to any other nodes:
    MATCH (p:Person)
    WHERE p.name = 'Liam Neeson'
    DETACH DELETE p
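Because DETACH DELETE removes every relationship attached to the node, it can be worth checking what would be removed first. A read-only sketch to run before the delete; the `relationships` alias is illustrative.

```cypher
MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
OPTIONAL MATCH (p)-[r]-()      // any relationship, either direction
// count(r) ignores nulls, so a node with no relationships reports 0
RETURN p.name, count(r) AS relationships
```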
  • 295.
    Exercise 11: Deleting Nodes and Relationships
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 11.
    Note: This exercise has 6 steps. Estimated time to complete: 20 minutes
  • 296.
    Using MERGE to create nodes
    Current Michael Caine Person node:
    Add a Michael Caine Actor node with a value of 1933 for born using MERGE. The Actor node is not found, so a new node is created:
    MERGE (a:Actor {name: 'Michael Caine'})
    SET a.born = 1933
    RETURN a
    Resulting Michael Caine nodes:
    Important: Only specify properties that will have unique keys when you merge.
  • 297.
    Specifying creation behavior for the merge
    Current Michael Caine nodes:
    Add a Sir Michael Caine Person node with a born value of 1934 using MERGE, and also set the birthPlace property:
    MERGE (a:Person {name: 'Sir Michael Caine'})
    ON CREATE SET a.born = 1934, a.birthPlace = 'London'
    RETURN a
    Resulting Michael Caine nodes:
  • 298.
    Specifying match behavior for the merge
    Current Michael Caine nodes:
    Add or update the Michael Caine Person node:
    MERGE (a:Person {name: 'Sir Michael Caine'})
    ON CREATE SET a.born = 1934, a.birthPlace = 'UK'
    ON MATCH SET a.birthPlace = 'UK'
    RETURN a
  • 299.
    Using MERGE to create relationships
    Make sure that all Person nodes whose name ends with Caine are connected to the Movie node, Batman Begins:
    MATCH (p:Person), (m:Movie)
    WHERE m.title = 'Batman Begins' AND p.name ENDS WITH 'Caine'
    MERGE (p)-[:ACTED_IN]->(m)
    RETURN p, m
  • 300.
    Exercise 12: Merging Data in Graph
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 12.
    Note: This exercise has 16 steps. Estimated time to complete: 45 minutes
  • 301.
  • 302.
    Question 1
    What Cypher clauses can you use to create a node?
    Select the correct answers.
    ❏ CREATE
    ❏ CREATE NODE
    ❏ MERGE
    ❏ ADD
  • 303.
    Answer 1
    What Cypher clauses can you use to create a node?
    Select the correct answers.
    ✅ CREATE
    ❏ CREATE NODE
    ✅ MERGE
    ❏ ADD
  • 304.
    Question 2
    Suppose that you have retrieved a node, s, with a property, color. What Cypher clause do you use to delete the color property from this node?
    Select the correct answers.
    ❏ DELETE s.color
    ❏ SET s.color=null
    ❏ REMOVE s.color
    ❏ SET s.color=?
  • 305.
    Answer 2
    Suppose that you have retrieved a node, s, with a property, color. What Cypher clause do you use to delete the color property from this node?
    Select the correct answers.
    ❏ DELETE s.color
    ✅ SET s.color=null
    ✅ REMOVE s.color
    ❏ SET s.color=?
  • 306.
    Question 3
    Suppose you retrieve a node, n, in the graph that is related to other nodes. What Cypher clause do you write to delete this node and its relationships in the graph?
    Select the correct answers.
    ❏ DELETE n
    ❏ DELETE n WITH RELATIONSHIPS
    ❏ REMOVE n
    ❏ DETACH DELETE n
  • 307.
    Answer 3
    Suppose you retrieve a node, n, in the graph that is related to other nodes. What Cypher clause do you write to delete this node and its relationships in the graph?
    Select the correct answers.
    ❏ DELETE n
    ❏ DELETE n WITH RELATIONSHIPS
    ❏ REMOVE n
    ✅ DETACH DELETE n
  • 308.
    Summary
    You should be able to write Cypher statements to:
    ● Create a node:
      ■ Add and remove node labels.
      ■ Add and remove node properties.
      ■ Update properties.
    ● Create a relationship:
      ■ Add and remove properties for a relationship.
    ● Delete a node.
    ● Delete a relationship.
    ● Merge data in a graph:
      ■ Create nodes.
      ■ Create relationships.
  • 309.
  • 310.
    Managing constraints and node keys
    Automatically control the data that is added to the graph:
    • Uniqueness: Unique values for node properties
    • Existence: Required properties for nodes or relationships
  • 311.
    Ensuring that a property value for a node is unique
    Ensure that the title for a node of type Movie is unique:
    CREATE CONSTRAINT ON (m:Movie) ASSERT m.title IS UNIQUE
    ● This statement will fail if there are any Movie nodes in the graph that have the same value for the title property.
    ● This statement will still succeed if there are Movie nodes in the graph that do not have the title property.
  • 312.
    Ensuring uniqueness using the constraint
    After creating the constraint, we attempt to create a Movie with the title, The Matrix:
    CREATE (:Movie {title: 'The Matrix'})
  • 313.
    Ensuring that properties exist
    You can create a constraint that will ensure that when a node or relationship is created or updated, a particular property must have a value:
    CREATE CONSTRAINT ON (m:Movie) ASSERT exists(m.tagline)
    This statement failed because the Movie node for the movie, Something’s Gotta Give, does not have a value for the tagline property.
  • 314.
    Creating an exists constraint on a relationship
    We know that in the Movie graph, all :REVIEWED relationships currently have a property, rating. We can create an existence constraint on that property as follows:
    CREATE CONSTRAINT ON ()-[rel:REVIEWED]-() ASSERT exists(rel.rating)
  • 315.
    Using the exists constraint on a relationship
    After creating this constraint, if we attempt to create a :REVIEWED relationship without setting the rating property:
    MATCH (p:Person), (m:Movie)
    WHERE p.name = 'Jessica Thompson' AND m.title = 'The Matrix'
    MERGE (p)-[:REVIEWED {summary: 'Great movie!'}]->(m)
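For contrast, a version of the same statement that supplies the required property should satisfy the constraint. This is a sketch only: the rating value of 90 is an illustrative number, not taken from the dataset.

```cypher
MATCH (p:Person), (m:Movie)
WHERE p.name = 'Jessica Thompson' AND m.title = 'The Matrix'
// rating is included in the pattern, so the new relationship has the required property
MERGE (p)-[:REVIEWED {rating: 90, summary: 'Great movie!'}]->(m)
```

Note that MERGE matches on the whole pattern, so an existing :REVIEWED relationship with different property values would not be reused.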
  • 316.
    Retrieving constraints defined for the graph
    CALL db.constraints()
    Note: Adding the method notation for this CALL statement enables you to use the call for returning results that may be used later in the Cypher statement.
  • 317.
    Dropping constraints
    DROP CONSTRAINT ON ()-[rel:REVIEWED]-() ASSERT exists(rel.rating)
  • 318.
    Creating node keys - 1
    A node key:
    • Is a unique constraint for a set of properties for a node
    • Is implemented as an index in the graph
    Suppose that in our Movie graph, we will not allow a Person node to be created where both the name and born properties are the same. We can create a constraint that will be a node key to ensure that this uniqueness for the set of properties is asserted:
    CREATE CONSTRAINT ON (p:Person) ASSERT (p.name, p.born) IS NODE KEY
    We attempt to create the constraint, but it fails because there is a Person node in the graph that does not have the born property set.
  • 319.
    Creating node keys - 2
    We then ensure that all Person nodes have a value for the born property:
    MATCH (p:Person)
    WHERE NOT exists(p.born)
    SET p.born = 0
    The creation of the node key will now be successful.
    Any subsequent attempt to create or modify an existing Person node with name or born values that violate the uniqueness constraint as a node key will fail.
  • 320.
    Using LOAD CSV for Import
  • 321.
    In This Module You’ll Learn ...
    How to:
    ● Prepare the graph and data for import
      ○ Inspect data
      ○ Determine if data needs to be transformed
      ○ Determine the size of the data that will be imported
      ○ Create the constraints in the graph
    ● Import the data with LOAD CSV
    ● Create indexes for newly-loaded data
    The APOC library adds support for importing data from various data formats, including JSON, XML, and XLS: https://neo4j.com/labs/apoc/4.1/import/
  • 322.
    Prepare for Data Import
  • 323.
    CSV File Structure
    Lines in a CSV file contain rows of data from a data source
    ● Commonly this is from a table in an RDBMS
    For the CSV file(s) determine:
    ● Whether the first row contains header information
      ○ This contains column names for all rows in the file
    ● What the delimiter is between the fields in a row
  • 324.
  • 325.
    Is the Data Clean?
    1. Check for headers that do not match
    2. Are quotes used correctly?
    3. If an element has no value, will an empty string be used?
    4. Are UTF-8 prefixes used (for example uc)?
    5. Do some fields have trailing spaces?
    6. Do the fields contain binary zeros?
    7. Understand how lists are formed
      ● The default is to use colon (:) as the separator
    8. Is comma (,) the delimiter?
    9. Check for typos
  • 326.
    Inspect the Data From a URL
    LOAD CSV WITH HEADERS
    FROM 'https://data.neo4j.com/v4.0-intro-neo4j/people.csv' AS line
    RETURN line LIMIT 10
  • 327.
    Example: Inspect the Data Stored Locally
    LOAD CSV WITH HEADERS
    FROM 'file:///people.csv' AS line
    RETURN line LIMIT 10
  • 328.
    Determine if Data Needs Transformation
    LOAD CSV reads every field as a string. Field values that hold numbers can be converted with:
    ● toInteger()
    ● toFloat()
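A quick way to see what the two functions do is to call them on string literals. In this sketch the literals stand in for CSV field values and are not taken from the course data.

```cypher
// Literal strings stand in for CSV fields here.
RETURN toInteger('1995') AS releaseYear,
       toFloat('7.5')    AS avgVote,
       toInteger('n/a')  AS notANumber  // cannot be parsed, so this is null
```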
  • 329.
    Preview the Data Transformation
    LOAD CSV WITH HEADERS FROM 'file:///movies1.csv' AS line
    RETURN toFloat(line.avgVote), line.genres, toInteger(line.movieId), line.title, toInteger(line.releaseYear)
    LIMIT 10
  • 330.
    Transforming Lists
    LOAD CSV WITH HEADERS FROM 'file:///movies1.csv' AS line
    RETURN toFloat(line.avgVote), split(coalesce(line.genres,""), ":"), toInteger(line.movieId), line.title, toInteger(line.releaseYear)
    LIMIT 10
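The role of coalesce() in that query can be seen in isolation. In this sketch the genre string is a made-up value and null stands in for a row whose genres field is empty.

```cypher
// split() turns a colon-separated string into a list of strings.
// coalesce() swaps a missing (null) value for "" so that split()
// always receives a string instead of null.
RETURN split('Action:Drama', ':')    AS genres,
       split(coalesce(null, ''), ':') AS genresWhenMissing
```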
  • 331.
    Create Constraints Before Loading the Data
    CREATE CONSTRAINT UniqueMovieIdConstraint ON (m:Movie) ASSERT m.id IS UNIQUE;
    CREATE CONSTRAINT UniquePersonIdConstraint ON (p:Person) ASSERT p.id IS UNIQUE
  • 332.
    Determine Size of the Data to be Loaded
    LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS line
    RETURN count(line)
  • 333.
    Loading a Large CSV File
    Two options for loading data when the number of rows exceeds 100K:
    1. USING PERIODIC COMMIT LOAD CSV
    2. Use the APOC library
    Helpful Links:
    APOC: https://neo4j.com/labs/apoc/4.2/
    apoc.periodic.iterate: https://neo4j.com/labs/apoc/4.2/graph-updates/periodic-execution/
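The second option could look roughly like this. It is a sketch only: it assumes the APOC library is installed, reuses the course's movies1.csv URL and column names, and picks a batch size of 500 arbitrarily.

```cypher
// The first statement streams the CSV rows; the second runs once per row
// and is committed in batches of 500 (an arbitrary batch size).
CALL apoc.periodic.iterate(
  "LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/v4.0-intro-neo4j/movies1.csv' AS row RETURN row",
  "MERGE (m:Movie {id: toInteger(row.movieId)})
   ON CREATE SET m.title = row.title,
                 m.avgVote = toFloat(row.avgVote),
                 m.releaseYear = toInteger(row.releaseYear)",
  {batchSize: 500, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
```

Keeping parallel set to false is the cautious choice here, since parallel batches that MERGE on the same label can contend for locks.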
  • 334.
  • 335.
    Importing Nodes
    :auto USING PERIODIC COMMIT 500
    LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/v4.0-intro-neo4j/movies1.csv' AS row
    MERGE (m:Movie {id: toInteger(row.movieId)})
    ON CREATE SET m.title = row.title, m.avgVote = toFloat(row.avgVote), m.releaseYear = toInteger(row.releaseYear), m.genres = split(row.genres, ":")
    More on USING PERIODIC COMMIT: https://neo4j.com/developer/guide-import-csv/#_important_tips_for_load_csv
  • 336.
    Importing Relationships
    LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/v4.0-intro-neo4j/directors.csv' AS row
    MATCH (movie:Movie {id: toInteger(row.movieId)})
    MATCH (person:Person {id: toInteger(row.personId)})
    MERGE (person)-[:DIRECTED]->(movie)
    ON CREATE SET person:Director
  • 337.
  • 338.
    Add Indexes
    // Do this only after ALL data has been imported
    CREATE INDEX MovieTitleIndex FOR (m:Movie) ON (m.title);
    CREATE INDEX PersonNameIndex FOR (p:Person) ON (p.name)
  • 339.
    Exercise 16: LOAD CSV for Import
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 16.
    Note: This exercise has 9 steps. Estimated time to complete: 30 minutes
  • 340.
  • 341.
    Question 1
    When you execute LOAD CSV, what unit of data is read from the data source?
    Select the correct answer:
    ❏ A field
    ❏ All field values for a single field
    ❏ A row
    ❏ A table
  • 342.
  • 343.
    Question 2
    What should you add to the graph before you import using LOAD CSV?
    Select the correct answer:
    ❏ Indexes for all important queries
    ❏ Schema containing the names of node labels that will be created
    ❏ Schema containing the types that will be assigned to properties during the load
    ❏ Uniqueness constraints
  • 344.
  • 345.
    Question 3
    In general, what is the maximum number of rows you can process using LOAD CSV?
    Select the correct answer:
    ❏ 1K
    ❏ 10K
    ❏ 100K
    ❏ 1M
  • 346.
  • 347.
    Summary
    You should now be able to:
    ● Describe the steps for importing data with Cypher
    ● Prepare the graph and data for import
    ● Import the data with LOAD CSV
    ● Create indexes for newly-loaded data
  • 348.
  • 349.
    In This Module You’ll Learn ...
    How to:
    ● Use parameters in your Cypher statements
    ● Analyze Cypher execution
    ● Monitor queries
  • 350.
  • 351.
    Cypher Parameters
    ● Most deployed applications that use Neo4j have client code written in other languages
      ○ For example: Java, JavaScript, Python, and others
    ● In a deployed application, values are almost never hard-coded in Cypher statements
    ● Cypher parameters are used to pass values to Cypher statements
  • 352.
    Using Cypher Parameters
    In Cypher, parameter names begin with $
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = $actorName
    RETURN m.released, m.title ORDER BY m.released DESC
    At runtime, the value of $actorName is used in the Cypher statement
  • 353.
    Setting a Parameter
    :param actorName => 'Tom Hanks'
  • 354.
    Using the Parameter
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = $actorName
    RETURN m.released, m.title ORDER BY m.released DESC
    To run the same query for a different actor, set the parameter again:
    :param actorName => 'Tom Cruise'
  • 355.
    Setting Multiple Parameters
    :params {actorName: 'Tom Cruise', movieName: 'Top Gun'}
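Once both parameters are set in the Browser session, a single statement can refer to them. This is a minimal sketch, assuming the Movie graph stores the film title in a property named title, as in the earlier queries.

```cypher
// Uses both session parameters: $actorName and $movieName.
MATCH (p:Person {name: $actorName})-[:ACTED_IN]->(m:Movie {title: $movieName})
RETURN p.name, m.title, m.released
```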
  • 356.
  • 357.
  • 358.
  • 359.
  • 360.
    Analyzing Queries
    There are two ways to analyze Cypher queries
    ● Prefix the query with either EXPLAIN or PROFILE
    EXPLAIN
    ● Provides estimates of the graph engine processing
    ● Does not execute the Cypher statement
    PROFILE
    ● The graph engine executes the query
    ● Provides profiling information based on what occurred during execution
  • 361.
    Analysis Using EXPLAIN
    EXPLAIN returns a Cypher query plan
    A Cypher query plan shows what is expected:
    ● Operations
    ● Where rows are processed
    ● What rows are passed on to the next operation (step)
    Evaluating and comparing Cypher statements
    ● Use EXPLAIN to understand the stages of processing that will occur when the Cypher statement executes
  • 362.
    Setting Parameters
    :params {actorName: 'Hugo Weaving', year: 2000}
  • 363.
    Using EXPLAIN
    EXPLAIN MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = $actorName AND m.released < $year
    RETURN p.name, m.title, m.released
  • 364.
    Expanding the Steps
    EXPLAIN MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = $actorName AND m.released < $year
    RETURN p.name, m.title, m.released
    (Figure: query plan showing all steps)
  • 365.
    Using PROFILE
    PROFILE MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = $actorName AND m.released < $year
    RETURN p.name, m.title, m.released
    (Figure: profile showing all steps expanded)
  • 366.
    Expanding PROFILE Steps
    PROFILE MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WHERE p.name = $actorName AND m.released < $year
    RETURN p.name, m.title, m.released
  • 367.
    PROFILE Without Node Labels
    PROFILE MATCH (p)-[:ACTED_IN]->(m)
    WHERE p.name = $actorName AND m.released < $year
    RETURN p.name, m.title, m.released
    Query changed:
    ● With labels: (p:Person)-[:ACTED_IN]->(m:Movie)
    ● No labels: (p)-[:ACTED_IN]->(m)
    (Figure: PROFILE plans compared, No Labels vs. With Labels)
  • 368.
  • 369.
    Monitoring Queries
    Causes for long-running Cypher queries:
    ● The query returns a large amount of data
      ○ Although the query has completed execution in the graph engine, it is still creating the result stream
      Example A: MATCH (a)--(b)--(c)--(d)--(e)--(f)--(g) RETURN a
    ● Query execution takes a long time to complete processing
      Example B: MATCH (a), (b), (c), (d), (e) RETURN count(id(a))
  • 370.
    Killing a Query
    Kill the query by closing the result pane
  • 371.
    Monitoring Queries
    ● :queries command
    (Figure: one Browser with a long-running query; a second Browser opened to monitor the query)
  • 372.
  • 373.
    Killing a Long-running Query
    The :queries command is only available in Neo4j Enterprise Edition
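Outside the Browser's :queries view, Neo4j 4.x servers also expose built-in procedures for the same task. The sketch below assumes a 4.x server where dbms.listQueries and dbms.killQuery are available; later releases replace them with other commands. The query id passed to killQuery is a placeholder and must be copied from the listing.

```cypher
// List the running queries with their ids and elapsed time.
CALL dbms.listQueries()
YIELD queryId, username, elapsedTimeMillis, query
RETURN queryId, username, elapsedTimeMillis, query
ORDER BY elapsedTimeMillis DESC;

// Kill one of them; 'query-123' is a placeholder id, not a real value.
CALL dbms.killQuery('query-123')
```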
  • 374.
  • 375.
    Cypher Query Best Practices
    ● Indexes: Create and use indexes effectively
    ● Parameters: Use parameters rather than literals in queries
    ● Labels: Specify node labels in MATCH clauses
    ● Rows:
      ○ Reduce the number of rows passed and processed
      ○ Reduce the rows processed by using DISTINCT and LIMIT early in the query
    ● Aggregate: Early in the query, rather than in the RETURN clause
    ● Properties: Defer property access until it is needed
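Several of these practices can be combined in a single statement. The following is a sketch against the Movie graph, assuming the $year parameter has been set in the session; the limit of 5 is arbitrary.

```cypher
// Labels narrow the starting nodes and $year avoids a literal value.
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released < $year
// Aggregate and cut the row count early, before the final RETURN.
WITH p, count(m) AS movieCount
ORDER BY movieCount DESC
LIMIT 5
// p.name is read only for the five rows that survive the LIMIT.
RETURN p.name AS actor, movieCount
```

Prefixing the statement with PROFILE shows how many rows each step passes on, which makes it easy to compare against a version that aggregates in the RETURN clause.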
  • 376.
    Exercise 15: Using Query Best Practices
    In the query edit pane of Neo4j Browser, execute the browser command:
    :play 4.0-intro-neo4j-exercises
    and follow the instructions for Exercise 15.
    Note: This exercise has 14 steps. Estimated time to complete: 30 minutes
  • 377.
  • 378.
    Question 1
    What Cypher keyword can you use to prefix any Cypher statement to examine how many db hits occurred when the statement executed?
    Select the correct answer:
    ❏ ANALYZE
    ❏ EXPLAIN
    ❏ PROFILE
    ❏ MONITOR
  • 379.
  • 380.
    Question 2
    What commands do you use to set values for parameters in your Neo4j Browser session?
    Select the correct answers:
    ❏ :set param
    ❏ :param
    ❏ :set params
    ❏ :params
  • 381.
  • 382.
    Question 3
    Suppose you are executing queries in Neo4j Browser Session A and monitoring them in Neo4j Browser Session B with the :queries command. What are some ways that you can kill a query?
    Select the correct answers:
    ❏ You can close the result pane in Session A, if the query can be seen in Session B
    ❏ You can close the result pane in Session A, if the query can no longer be seen in Session B
    ❏ You can kill any running query seen in Session B
    ❏ You can close the Neo4j Browser that is running Session A
  • 383.
  • 384.
    Summary
    You should now be able to:
    ● Use parameters in your Cypher statements
    ● Analyze Cypher execution
    ● Monitor queries
  • 385.
  • 386.
  • 387.
    Accessing Neo4j resources
    There are many ways that you can learn more about Neo4j. A good starting point for learning about the resources available to you is the Neo4j Learning Resources page at https://neo4j.com/developer/resources/.