INTRODUCTION TO GRAPH
DATABASES WITH NEO4J
Brant Boehmann
@tbrantb
About Me
- ~20 Years Dev Experience
- Java/JVM Specialist
- Lead Software Engineer @ScrippsNet
Outline
- DB Types
- What is a GraphDB?
- RDBMS vs GraphDB
- Neo4j
- Cypher
- Neo4j Console
Multiple DB Types
- Relational
  - Oracle, MS-SQL, PostgreSQL, MySQL, ...
- Columnar
  - Greenplum, Redshift, ...
- Key-Value
  - Dynamo, Riak, ...
- Document
  - MongoDB, CouchDB, ...
- Graph
  - Neo4j, Titan, OrientDB, ...
Which DB type is right?
- Data Size
- Data Structure
- Access Patterns
- Write Patterns
- Performance Requirements
- Consistency Requirements
- Degree of Data Connectedness
Remember Graph Basics
- Node (Vertex)
- Relationship (Edge)
- Direction
- Path
[Figure: example graph with nodes A, B, C, D, and E]
Turn a Graph into a DB
- Node Labels
- Relationship Type
- Properties
Shoehorn into RDBMS Attempt #1
- Convert a Graph to a Table
- Adjacency Matrix
Shoehorn into RDBMS Attempt #1
- Adjacency matrix doesn’t fit in an RDBMS table :(
- Why?
- Schema / Static Columns
- Column growth
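To make the column-growth problem concrete, here is a minimal Python sketch. The node names A to E follow the slide, but the edge list and the `adjacency_matrix` helper are assumptions made up for illustration.

```python
# Hypothetical five-node graph; the edges are invented for illustration.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "D"), ("B", "C"), ("D", "E")]


def adjacency_matrix(nodes, edges):
    """Return a square 0/1 matrix: row = start node, column = end node."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for start, end in edges:
        matrix[index[start]][index[end]] = 1
    return matrix


matrix = adjacency_matrix(nodes, edges)

# Adding a single node adds a row AND a column to the whole matrix.
# In a relational table that would mean a schema change per new node.
bigger = adjacency_matrix(nodes + ["F"], edges + [("E", "F")])
```

Every new node widens every existing row, which is why a fixed-column table is a poor fit for this representation.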
Shoehorn into RDBMS Attempt #2
- Join Table
- Many-To-Many
Who are Bob’s Friends?
Pretty Simple
SELECT p2.id, p2.name
FROM person p1
JOIN person_relationship rel
ON p1.id = rel.start_person_id
AND p1.id = 1 --Bob
AND rel.relationship_type=1 --Friend
JOIN person p2
ON p2.id = rel.end_person_id
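The query above can be tried end to end with a small sketch. It assumes SQLite through Python's `sqlite3` module; the column types, the `1 = Friend` encoding, and the sample people are assumptions, since the slides only show the query.

```python
import sqlite3

# In-memory database with an assumed schema for the two tables in the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_relationship (
    start_person_id   INTEGER NOT NULL REFERENCES person(id),
    end_person_id     INTEGER NOT NULL REFERENCES person(id),
    relationship_type INTEGER NOT NULL  -- 1 = Friend (assumed encoding)
);
INSERT INTO person VALUES (1, 'Bob'), (2, 'Alice'), (3, 'Carol'), (4, 'Dave');
INSERT INTO person_relationship VALUES (1, 2, 1), (1, 3, 1), (2, 4, 1);
""")

# Same join as on the slide, with ORDER BY added for a stable result.
bobs_friends = conn.execute("""
SELECT p2.id, p2.name
FROM person p1
JOIN person_relationship rel
  ON p1.id = rel.start_person_id
 AND p1.id = 1                 -- Bob
 AND rel.relationship_type = 1 -- Friend
JOIN person p2
  ON p2.id = rel.end_person_id
ORDER BY p2.id
""").fetchall()
```

With this sample data, `bobs_friends` holds Alice and Carol; Dave is a friend of Alice only, so he does not appear.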
Who are the Friends of Bob’s Friends?
SELECT p3.id, p3.name
FROM person p1
JOIN person_relationship rel1
ON p1.id = rel1.start_person_id
AND p1.id = 1 --Bob
AND rel1.relationship_type = 1 --Friend
JOIN person p2
ON p2.id = rel1.end_person_id
JOIN person_relationship rel2
ON p2.id = rel2.start_person_id
AND rel2.relationship_type = 1 --Friend
JOIN person p3
ON p3.id = rel2.end_person_id
Bob’s Friends’ Friends’ Friends to the N
Do I have to?
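For comparison, some relational engines can express an N-hop traversal with a recursive common table expression instead of one join per hop. The sketch below assumes SQLite through Python's `sqlite3`, an assumed schema for the `person` and `person_relationship` tables, and an invented `friends_within` helper. It is one possible workaround, not a recommendation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_relationship (
    start_person_id   INTEGER NOT NULL,
    end_person_id     INTEGER NOT NULL,
    relationship_type INTEGER NOT NULL  -- 1 = Friend (assumed encoding)
);
INSERT INTO person VALUES (1,'Bob'),(2,'Alice'),(3,'Carol'),(4,'Dave'),(5,'Eve');
-- A simple chain: Bob -> Alice -> Carol -> Dave -> Eve
INSERT INTO person_relationship VALUES (1,2,1),(2,3,1),(3,4,1),(4,5,1);
""")


def friends_within(conn, person_id, max_depth):
    """Names reachable over 1..max_depth FRIEND hops from person_id."""
    rows = conn.execute("""
    WITH RECURSIVE reach(id, depth) AS (
        SELECT ?, 0
        UNION
        SELECT rel.end_person_id, reach.depth + 1
        FROM reach
        JOIN person_relationship rel
          ON rel.start_person_id = reach.id
         AND rel.relationship_type = 1
        WHERE reach.depth < ?          -- depth cap also guards against cycles
    )
    SELECT DISTINCT p.name
    FROM reach
    JOIN person p ON p.id = reach.id
    WHERE reach.depth > 0
    ORDER BY p.name
    """, (person_id, max_depth)).fetchall()
    return [name for (name,) in rows]
```

It works, but the traversal logic is buried in SQL plumbing, and each hop is still a join against the relationship table.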
What is Bob’s Bacon # ?
?
Not all problems are well suited for RDBMS
How do Graph DBs do this better?
- RDBMS uses joins & table/index scans
- Native vs Non-Native GraphDB
- “Index Free Adjacency”
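The difference between the two approaches can be sketched in a few lines of Python. This is only a conceptual analogy under assumed names (`Node`, `friends_via_scan`, `friends_via_pointers`); it does not describe how any particular graph database lays out its storage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    # Each node carries direct references to its neighbours.
    friends: List["Node"] = field(default_factory=list)


def friends_via_scan(person_name, relationships):
    """Join-table style: each hop filters the global relationship list."""
    return [end for start, end in relationships if start == person_name]


def friends_via_pointers(node):
    """Index-free-adjacency style: each hop follows references on the node."""
    return [friend.name for friend in node.friends]


# Hypothetical sample data, stored both ways.
relationships = [("Bob", "Alice"), ("Bob", "Carol"), ("Alice", "Carol")]
bob, alice, carol = Node("Bob"), Node("Alice"), Node("Carol")
bob.friends.extend([alice, carol])
alice.friends.append(carol)
```

In the scan version the cost of a hop depends on the size of the whole relationship list (or on an index lookup); in the pointer version it depends only on how many neighbours the node has.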
Popular GraphDB Use Cases
- Social Networks
- Fraud Detection
- Recommendation Engines
- Identity and Access Management (IAM)
- Master Data Management (MDM)
- Network Ops
- Content/Asset Management
- Journalism (see Panama Papers)
Algorithms/Concepts
- Shortest Path
- A*
- Triadic Closures
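As an illustration of the last item, a triadic closure check looks for two people who share a friend but are not yet connected to each other, which is a common basis for "people you may know" suggestions. The sketch below is plain Python over an assumed undirected friendship list; `open_triads` and the sample names are hypothetical.

```python
from itertools import combinations


def open_triads(friendships):
    """Return pairs that share a friend but are not directly connected."""
    neighbours = {}
    for a, b in friendships:
        # Treat friendships as undirected for this sketch.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    suggestions = set()
    for friends in neighbours.values():
        # Any two friends of the same person form a candidate pair.
        for x, y in combinations(sorted(friends), 2):
            if y not in neighbours[x]:
                suggestions.add((x, y))
    return sorted(suggestions)


# Hypothetical chain: Alice - Bob - Carol - Dave
friendships = [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave")]
```

Here Alice and Carol share Bob, and Bob and Dave share Carol, so those two pairs are the triangles waiting to be closed.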
Neo4j
- Labeled Property Graph
- Native Graph DB with Index-Free Adjacency
- JVM Based
- Fully ACID
- All relationships are directed *
- Schema-less, Schema-optional
- Embedded, Single Instance Server, HA Cluster
Neo4j APIs
- Embedded Java API
- HTTP API
- Drivers/Bolt Protocol
Cypher
- Declarative Graph Query Language
- Originated at Neo Technology
- Being standardized at openCypher.org
- ASCII Art-ish
ASCII Art-ish?
- () look like circles or nodes
- --, -->, <-- look like relationships or edges
MATCH (a)-->(b)
WHERE a.name = 'Bob'
RETURN b
MATCH Clause
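MATCH (a)-->(b)            // any outgoing relationship
MATCH (a)-[:FRIEND]->(b)   // outgoing FRIEND relationship
MATCH (a)<-[:FRIEND]-(b)   // incoming FRIEND relationship
MATCH (a)-[:FRIEND]-(b)    // FRIEND relationship in either direction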
MATCH Clause with Labels
MATCH (p:Person)-[:DIRECTED]->(m:Movie)
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
WHERE
MATCH (p:Person)-[:DIRECTED]->(m:Movie)
WHERE p.name = 'Ron Howard'
  AND p.born < 1955
// Equality checks can also be written inline as a property map
MATCH (p:Person {name: 'Ron Howard'})-[:DIRECTED]->(m:Movie)
RETURN, ORDER BY, LIMIT
MATCH (p:Person {name: 'Ron Howard'})-[:DIRECTED]->(m:Movie)
RETURN m.title, m.released, m.tagline
ORDER BY m.title
LIMIT 3
Overall Read Query Structure
[MATCH WHERE]
[OPTIONAL MATCH WHERE]
[WITH [ORDER BY] [SKIP] [LIMIT]]
RETURN [ORDER BY] [SKIP] [LIMIT]
CREATE
// Create both nodes and the relationship in one statement
CREATE (matrix:Movie {title:'The Matrix', released:1999})
CREATE (keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (keanu)-[:ACTED_IN]->(matrix)
// Or connect two nodes that already exist
MATCH (matrix:Movie {title:'The Matrix'}),
      (keanu:Person {name:'Keanu Reeves'})
CREATE (keanu)-[:ACTED_IN]->(matrix)
UPDATE
MATCH (matrix:Movie {title:'The Matrix'}),
(keanu:Person {name:'Keanu Reeves'})
SET matrix.tagline = 'Follow the White Rabbit',
keanu.born = 1900
RETURN matrix, keanu //optional
DELETE
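// Fails if the node still has relationships
MATCH (p:Person {name: 'Keanu Reeves'})
DELETE p
// Delete the node together with its relationships
MATCH (p:Person {name: 'Keanu Reeves'})-[r]-()
DELETE r, p
// Shorthand: remove the relationships and the node in one step
MATCH (p:Person {name: 'Keanu Reeves'})
DETACH DELETE p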
Remember our friends?
Who are Bob’s Friends?
SELECT p2.id, p2.name
FROM person p1
JOIN person_relationship rel
ON p1.id = rel.start_person_id
AND p1.id = 1 --Bob
AND rel.relationship_type=1 --Friend
JOIN person p2
ON p2.id = rel.end_person_id
MATCH (b:Person {name:'Bob'})-[:FRIEND]->(f:Person)
RETURN f.name
Who are the Friends of Bob’s Friends?
SELECT p3.id, p3.name
FROM person p1
JOIN person_relationship rel1
ON p1.id = rel1.start_person_id
AND p1.id = 1 --Bob
AND rel1.relationship_type = 1 --Friend
JOIN person p2
ON p2.id = rel1.end_person_id
JOIN person_relationship rel2
ON p2.id = rel2.start_person_id
AND rel2.relationship_type = 1 --Friend
JOIN person p3
ON p3.id = rel2.end_person_id
MATCH (b:Person {name:'Bob'})-[:FRIEND]->(f:Person)-[:FRIEND]->(fof:Person)
RETURN DISTINCT fof.name
// Same idea with a variable-length pattern of exactly 2 hops
MATCH (b:Person {name:'Bob'})-[:FRIEND *2]->(fof:Person)
RETURN DISTINCT fof.name
Bob’s Friends’ Friends’ Friends to the N
// Exactly 2 hops
MATCH (b:Person {name:'Bob'})-[:FRIEND *2]->(fof:Person)
RETURN DISTINCT fof.name
// Any number of hops (1 or more)
MATCH (b:Person {name:'Bob'})-[:FRIEND *]->(fof:Person)
RETURN DISTINCT fof.name
// Between 1 and 5 hops
MATCH (b:Person {name:'Bob'})-[:FRIEND *..5]->(fof:Person)
RETURN DISTINCT fof.name
// Between 2 and 5 hops
MATCH (b:Person {name:'Bob'})-[:FRIEND *2..5]->(fof:Person)
RETURN DISTINCT fof.name
What is Bob’s Bacon # ?
MATCH path=shortestPath(
(kb:Person {name:'Kevin Bacon'})-[*]-(b:Person {name:'Bob'})
)
RETURN length(path)
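Conceptually, an unweighted shortest-path query like the one above can be answered with a breadth-first search. The Python sketch below shows the idea on a tiny in-memory graph; `hop_count`, the adjacency dictionary, and the film linking Bob to Tom Hanks are assumptions for illustration, and nothing here claims to mirror how Neo4j implements `shortestPath`.

```python
from collections import deque


def hop_count(graph, start, goal):
    """Breadth-first search: number of relationships on a shortest path."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path between the two nodes


# Undirected Person/Movie graph; "Hypothetical Film" is made up.
graph = {
    "Kevin Bacon": ["Apollo 13"],
    "Apollo 13": ["Kevin Bacon", "Tom Hanks"],
    "Tom Hanks": ["Apollo 13", "Hypothetical Film"],
    "Hypothetical Film": ["Tom Hanks", "Bob"],
    "Bob": ["Hypothetical Film"],
}
```

Note that in a graph where people connect through movies, the path length counts person-to-movie hops too, so a length of 4 here corresponds to a Bacon number of 2.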
Resources
- Developer Website
  - https://neo4j.com/developer/
- Neo4j Docs
  - https://neo4j.com/docs/
- Cypher Refcard
  - https://neo4j.com/docs/cypher-refcard/
- Neo4j Book (free)
- Me: @tbrantb on Twitter
