RDBMS TO GRAPH
Live from San Mateo, March 9, 2016
Webinar
Data used to be stored like this: punch tape. Or punch cards. 

Horrible way to read and understand data.

Impossible to index easily, cross-reference, or eliminate inconsistencies.
Then we started storing data in tables, and “relational” databases.

Sometimes those tables are human-readable.

But as soon as you normalize the data to eliminate duplication and inconsistencies, many fields start referencing auto-generated
numerical foreign keys. And your data becomes difficult to understand and maintain without complicated JOIN queries.
[Diagram: ACCOUNT HOLDER 1, 2 and 3 linked to CREDIT CARD, BANK ACCOUNT, UNSECURE LOAN, ADDRESS, PHONE NUMBER and SSN 2 nodes]
Enter Graph Databases. The future is now.

Graph Databases, like Neo4j, store data in a much more logical way. A way that represents the real world, and prioritizes the
representation, discoverability and maintainability of data relationships.
Intuitiveness
Speed
Agility
Speed
“We found Neo4j to be literally thousands of times faster
than our prior MySQL solution, with queries that require
10-100 times less code. Today, Neo4j provides eBay with
functionality that was previously impossible.”
- Volker Pacher, Senior Developer
“Minutes to milliseconds” performance
Queries up to 1000x faster than RDBMS or other NoSQL
A Naturally Adaptive Model + A Query Language Designed for Connectedness = Agility
Cypher
Typical Complex SQL Join vs. the Same Query using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
       count(report) AS Total
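To make the pattern concrete, here is a rough Python sketch of what the Cypher query above computes, assuming the MANAGES relationships form a tree. The names in `manages` other than John Doe are made up for illustration, and `within_hops` and `subordinate_totals` are hypothetical helpers, not part of any Neo4j API.

```python
from collections import deque

# Hypothetical org chart: manager -> direct reports (the MANAGES relationships).
manages = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Bob": ["Erin"],
    "Carol": ["Frank"],
}

def within_hops(start, min_hops, max_hops):
    """Return everyone reachable from start in min_hops..max_hops MANAGES steps."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if min_hops <= depth <= max_hops:
            found.append(person)
        if depth < max_hops:
            for report in manages.get(person, []):
                queue.append((report, depth + 1))
    return found

def subordinate_totals(boss):
    """Mirror (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)."""
    totals = {}
    for sub in within_hops(boss, 0, 3):
        reports = within_hops(sub, 1, 3)
        if reports:  # a sub with no reports yields no row in the Cypher result
            totals[sub] = len(reports)
    return totals

totals = subordinate_totals("John Doe")
```

The `*0..3` range means the boss counts as his own first "subordinate" row, which is why John Doe appears in `totals` alongside the people below him.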
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
ABOUT ME
• Developed web apps for 5 years
including e-commerce, business
workflow, more.
• Worked at Google for 8 years on
Google Apps, Cloud Platform
• Technologies: Python, Java,
BigQuery, Oracle, MySQL, OAuth
ryan@neo4j.com
@ryguyrg
NEO4j USE CASES
Real Time Recommendations
Master Data Management
Fraud Detection
Identity & Access Management
Graph Based Search
Network & IT-Operations
GRAPH THINKING:
Real Time Recommendations
[Diagram: nodes connected by VIEWED and BOUGHT relationships]
Real-time recommendations are about finding the relationships that are relevant to recommending a product or a service…

…which is exactly why Walmart is using Neo4j.
“As the current market leader in graph databases,
and with enterprise features for scalability and
availability, Neo4j is the right choice to meet our
demands.” Marcos Wada
Software Developer, Walmart
GRAPH THINKING:
Master Data Management
[Diagram: nodes connected by MANAGES, LEADS, REGION and COLLABORATES relationships]
Master Data Management is about bringing together all the entities within an organization, and those external to it, in order to understand the relationships between them.
Neo4j is the heart of Cisco HMP: used for governance
and single source of truth and a one-stop shop for all
of Cisco’s hierarchies.
Cisco uses Neo4j for this: to power content management, resources and knowledge-base articles for use by sales teams. It also powers product recommendations, to make sure customers are getting the full power of Cisco’s offerings.

Although this project is focused on sales teams, another group has used Neo4j to power all of their helpdesk content.
GRAPH THINKING:
Master Data Management
[Diagram: Solution, Support Case, Knowledge Base Article and Message nodes]
Neo4j is the heart of Cisco’s Helpdesk Solution too.
GRAPH THINKING:
Fraud Detection
[Diagram: nodes connected by OPENED_ACCOUNT, HAS, IS_ISSUED and LIVES relationships]
Discovering fraud is another use case that is particularly well suited to graphs, because it’s all about finding fraudulent patterns. Here we work with the top banks and insurance companies, as well as many governments.
“Graph databases offer new methods of uncovering
fraud rings and other sophisticated scams with a
high-level of accuracy, and are capable of stopping
advanced fraud scenarios in real-time.”
Gorka Sadowski
Cyber Security Expert
GRAPH THINKING:
Graph Based Search
[Diagram: nodes connected by PUBLISH, INCLUDE, CREATE, CAPTURE, IN, SOURCE and USES relationships]
Uses Neo4j to manage the digital assets inside of its next
generation in-flight entertainment system.
GRAPH THINKING:
Network & IT-Operations
[Diagram: nodes connected by BROWSES, CONNECTS, BRIDGES, ROUTES, POWERS, HOSTS and QUERIES relationships]
Dependency analysis

Root cause analysis
Uses Neo4j for network topology analysis
for big telco service providers
GRAPH THINKING:
Identity And Access Management
[Diagram: nodes connected by TRUSTS, ID, AUTHENTICATES, OWNS and CAN_READ relationships]
Think of organizational hierarchies. No longer is it just a tree.
UBS was the recipient of the 2014
Graphie Award for “Best Identity And
Access Management App”
Neo4j Adoption by Selected Verticals
SOFTWARE
FINANCIAL SERVICES
RETAIL
MEDIA & BROADCASTING
SOCIAL NETWORKS
TELECOM
HEALTHCARE
AGENDA
• Use Cases
• SQL Pains
• Building a Neo4j Application
• Moving from RDBMS -> Graph Models
• Walk through an Example
• Creating Data in Graphs
• Querying Data
I hired this kid for all the handwriting you’ll see throughout the presentation.

So, don’t blame me.
SQL
Day in the Life of an RDBMS Developer
Let’s explore how your SQL developer works today.
They work with data in tables. 

Here’s a table of people and where they're from, their hair color and the university they attended.

This table is fairly natural, but it duplicates values across multiple rows. If you want to change the name of a university or a
country, you have to update every row that mentions it.
So, instead, you’d create a separate table for the country, with an ID that references it. This is your primary key.
This allows you to add additional properties.
Now, you use that ID to reference the country in the people table - a foreign key.
And you’d want to normalize the university table as well.
And use the university ID to reference it. Now your table is a lot less readable.
So, we see this set of 3 tables with arrows indicating references between primary keys and foreign keys, used in JOINs.
SELECT
  p.name,
  c.country, c.leader, p.hair,
  u.name, u.pres, u.state
FROM
  people p
  LEFT JOIN country c ON c.ID = p.country
  LEFT JOIN uni u ON p.uni = u.id
WHERE
  u.state = 'CT'
Your SQL looks like this.

Only, this is a super simple JOIN across 3 tables. I’ve often had to work with 10+ tables being JOINed.
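If you want to try this yourself, here is a minimal sketch of the three normalized tables using Python's built-in sqlite3 module. The table and column names follow the query above; the rows themselves are placeholders invented for illustration.

```python
import sqlite3

# In-memory database with the three tables from the slides (data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (id INTEGER PRIMARY KEY, country TEXT, leader TEXT);
    CREATE TABLE uni (id INTEGER PRIMARY KEY, name TEXT, pres TEXT, state TEXT);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY, name TEXT, hair TEXT,
        country INTEGER REFERENCES country(id),
        uni INTEGER REFERENCES uni(id)
    );
    INSERT INTO country VALUES (1, 'Country A', 'Leader A');
    INSERT INTO uni VALUES (10, 'Uni X', 'President X', 'CT');
    INSERT INTO uni VALUES (11, 'Uni Y', 'President Y', 'CA');
    INSERT INTO people VALUES (100, 'Person 1', 'brown', 1, 10);
    INSERT INTO people VALUES (101, 'Person 2', 'black', 1, 11);
""")

# The normalized people table on its own: mostly opaque numeric keys.
raw_people = conn.execute("SELECT * FROM people ORDER BY id").fetchall()

# The JOIN from the slide, needed to make the data readable again.
rows = conn.execute("""
    SELECT p.name, c.country, c.leader, p.hair, u.name, u.pres, u.state
    FROM people p
    LEFT JOIN country c ON c.id = p.country
    LEFT JOIN uni u ON p.uni = u.id
    WHERE u.state = 'CT'
""").fetchall()
```

Comparing `raw_people` with `rows` shows the trade-off: the stored rows are compact and consistent, but only the joined result is readable.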
[Slide: the word JOIN, repeated across the entire screen]
And your SQL developer? All she’s thinking about is joins
All day long.
And they’re keeping her up at night as well.
Meanwhile, it’s expensive to find data.

So we add indexes to make it easier.

But when we have to do index lookups for each and every JOIN?

And we have a dozen JOINs?

That’s expensive.
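One rough way to picture the difference: a JOIN resolves each foreign key through an index, while a graph can keep a direct reference from one record to its neighbour. The sketch below simulates the index with a sorted list and binary search. All the data and the `index_lookup` helper are hypothetical, and real database engines are far more sophisticated than this.

```python
from bisect import bisect_left

# Simulated table with an index on its primary key (ids kept sorted).
uni_ids = [10, 11, 12, 13]
uni_rows = ["Uni W", "Uni X", "Uni Y", "Uni Z"]

def index_lookup(key):
    """Binary search in the index: O(log n) work for every foreign key followed."""
    pos = bisect_left(uni_ids, key)
    if pos < len(uni_ids) and uni_ids[pos] == key:
        return uni_rows[pos]
    return None

# Relational style: the person row stores a foreign key that must be looked up.
person_row = {"name": "Person 1", "uni": 11}
via_index = index_lookup(person_row["uni"])

# Graph style: the person node holds a direct reference to the university node.
uni_node = {"name": "Uni X"}
person_node = {"name": "Person 1", "went_to": uni_node}
via_pointer = person_node["went_to"]["name"]
```

Both routes reach the same university, but the index lookup grows with the size of the table and is repeated for every JOIN, while following the reference does not depend on how many universities exist.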
What’s the solution? 

Denormalize! But now the data is hard to maintain and hard to keep consistent.
SQL Pains
• Complex to model and store relationships
• Performance degrades with increases in data
• Queries get long and complex
• Maintenance is painful
Graph Gains
• Easy to model and store relationships
• Performance of relationship traversal remains constant with growth in data size
• Queries are shortened and more readable
• Adding additional properties and relationships can be done on the fly - no migrations
John Resig, who you may know as the creator of jQuery, loves Neo4j because it simplifies life.
What does this Graph look like?
So you’ve seen what tables look like. How do graphs make this better?
CYPHER
[Diagram: two nodes, Ann and Dan, joined by a LOVES relationship]
The obligatory “Ann Loves Dan” example
Property Graph Model
CREATE (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
[Diagram: two NODEs, each with a LABEL and a PROPERTY, joined by a LOVES relationship]
The whiteboard model is the physical model.
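For readers who think in code, the property graph model can be sketched with two small Python classes. This is only an illustration of the concepts (nodes with labels and properties, typed relationships with a direction); the class and field names are my own and say nothing about how Neo4j stores data internally.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # e.g. {"name": "Dan"}

@dataclass
class Relationship:
    start: Node                                      # relationships are directed
    rel_type: str                                    # e.g. "LOVES"
    end: Node
    properties: dict = field(default_factory=dict)   # relationships can carry data too

# Equivalent of the CREATE statement on the slide.
dan = Node({"Person"}, {"name": "Dan"})
ann = Node({"Person"}, {"name": "Ann"})
loves = Relationship(dan, "LOVES", ann)
```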
MATCH
  (p:Person)-[:WENT_TO]->(u:Uni),
  (p)-[:LIVES_IN]->(c:Country),
  (u)-[:LED_BY]->(l:Leader),
  (u)-[:LOCATED_IN]->(s:State)
WHERE
  s.abbr = 'CT'
RETURN
  p.name,
  c.country, c.leader, p.hair,
  u.name, l.name, s.abbr
David Meza, chief knowledge architect at NASA, had this to say.
How do you use Neo4j?
CREATE MODEL
+
LOAD DATA QUERY DATA
Create your Graph Model

Load your data 

Query your data
How do you use Neo4j?
Querying can be done in the Neo4j Browser.
Language Drivers
JavaScript, Java, Ruby, .NET, Python, PHP, Haskell, Go
Native Server-Side Extensions
Need to get every last ounce of performance?

You can write server-side extensions in Java.
Architectural Options
[Diagram labels: Application; End User; Graph Database Cluster (Neo4j, Neo4j, Neo4j); Data Storage and Business Rules Execution; Data Mining and Aggregation; Bulk Analytic Infrastructure (Hadoop, EDW, …); Ad Hoc Analysis; Data Scientist; Databases (Relational, NoSQL, Hadoop)]
RDBMS to Graph Options
MIGRATE ALL DATA, MIGRATE SUBSET, or DUPLICATE SUBSET
[Diagram labels: Application; All Queries; Graph Queries; Non-Graph Queries; Relational Database; Graph Database; All Data; Non Graph Data]
FROM RDBMS TO GRAPHS
Northwind
Northwind - the canonical RDBMS Example
( )-[:TO]->(Graph)
( )-[:IS_BETTER_AS]->(Graph)
Starting with the ER Diagram
Locate the Foreign Keys
Drop the Foreign Keys
Find the JOIN Tables
(Simple) JOIN Tables Become Relationships
Attributed JOIN Tables -> Relationships with Properties
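The steps above can be sketched in a few lines of Python. The rows are invented and only loosely modelled on Northwind; `without` is a hypothetical helper. The relationship names PART_OF and INCLUDES match the ones used later in this deck.

```python
# Invented rows, loosely modelled on Northwind tables.
categories = [{"categoryID": 1, "categoryName": "Beverages"}]
products = [{"productID": 1, "productName": "Chai", "categoryID": 1}]
orders = [{"orderID": 10248}]
order_details = [{"orderID": 10248, "productID": 1, "unitPrice": 14.0, "quantity": 12}]

def without(row, *keys):
    """Copy a row, dropping the given (foreign key) columns."""
    return {k: v for k, v in row.items() if k not in keys}

nodes = {}          # (label, id) -> properties
relationships = []  # (start key, type, end key, properties)

# Every row of an entity table becomes a node; foreign keys are dropped.
for row in categories:
    nodes[("Category", row["categoryID"])] = dict(row)
for row in orders:
    nodes[("Order", row["orderID"])] = dict(row)
for row in products:
    nodes[("Product", row["productID"])] = without(row, "categoryID")

# Each foreign key becomes a relationship.
for row in products:
    relationships.append(
        (("Product", row["productID"]), "PART_OF", ("Category", row["categoryID"]), {})
    )

# An attributed JOIN table becomes a relationship with properties.
for row in order_details:
    relationships.append(
        (("Order", row["orderID"]), "INCLUDES", ("Product", row["productID"]),
         without(row, "orderID", "productID"))
    )
```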
Querying a Subset Today
As a Graph
QUERYING THE GRAPH
using openCypher
Declarative query language

Easy to learn for someone familiar with languages like SQL

But optimized for graphs, and quickly readable
Property Graph Model
CREATE (:Employee {firstName: "Steven"})-[:REPORTS_TO]->(:Employee {firstName: "Andrew"})
[Diagram: two NODEs, Steven and Andrew, each with a LABEL and a PROPERTY, joined by a REPORTS_TO relationship]
Who do people report to?
MATCH
(e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN
*
Results can be returned as nodes and relationships
MATCH
(e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN
e.employeeID AS managerID,
e.firstName AS managerName,
sub.employeeID AS employeeID,
sub.firstName AS employeeName;
or alternatively as a table.
Who does Robert report to?
MATCH
  p=(e:Employee)<-[:REPORTS_TO]-(sub:Employee)
WHERE
  sub.firstName = 'Robert'
RETURN
  p
What is Robert’s reporting chain?
MATCH
  p=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee)
WHERE
  sub.firstName = 'Robert'
RETURN
  p
But the power of the graph is in the ability to query arbitrary-length paths.

Note the asterisk in [:REPORTS_TO*].
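As a rough illustration of what following an arbitrary-length path means, here is a small Python sketch that walks REPORTS_TO relationships upward from one employee. The `reports_to` data is a made-up stand-in for the graph, and `reporting_chain` is a hypothetical helper. Note one difference: the Cypher query returns every path of length one or more, while this sketch returns only the full chain.

```python
# Hypothetical REPORTS_TO relationships: employee -> manager.
reports_to = {"Robert": "Steven", "Steven": "Andrew"}

def reporting_chain(name):
    """Follow REPORTS_TO hop by hop until reaching someone with no manager."""
    chain = [name]
    seen = {name}
    while chain[-1] in reports_to:
        manager = reports_to[chain[-1]]
        if manager in seen:  # guard against cycles in bad data
            break
        chain.append(manager)
        seen.add(manager)
    return chain

chain = reporting_chain("Robert")
```

The code never needs to know how deep the hierarchy is, which is the same convenience the `*` gives you in Cypher.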
Who’s the Big Boss?
MATCH
(e:Employee)
WHERE
NOT (e)-[:REPORTS_TO]->()
RETURN
e.firstName as bigBoss
Product Cross-Selling
MATCH
(choc:Product {productName: 'Chocolade'})
<-[:INCLUDES]-(:Order)<-[:SOLD]-(employee),
(employee)-[:SOLD]->(o2)-[:INCLUDES]->(other:Product)
RETURN
employee.firstName,
other.productName,
COUNT(DISTINCT o2) as count
ORDER BY
count DESC
LIMIT 5;
(ASIDE ON GRAPH COMPUTE)
Optimized for OLTP

But can be used for Graph Compute

Either with built-in functions

Or server-side extensions

Or via exporting data to Spark / GraphX for analysis
Shortest Path Between Airports
MATCH
  p = shortestPath(
    (a:Airport {code: "SFO"})-[*0..2]->
    (b:Airport {code: "MSO"}))
RETURN
  p
Example using built-in algorithms.

Dijkstra also available for weighted paths
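To show the idea behind an unweighted shortest-path search with a hop limit, here is a minimal breadth-first search in Python. It is a sketch of the general technique, not Neo4j's implementation. The routes in `routes` are invented, and `shortest_path` is a hypothetical helper whose `max_hops` argument plays the role of the `*0..2` bound.

```python
from collections import deque

# Invented directed flight routes between airports.
routes = {
    "SFO": ["SEA", "DEN"],
    "SEA": ["MSO"],
    "DEN": ["SLC"],
    "SLC": ["MSO"],
}

def shortest_path(start, goal, max_hops):
    """Breadth-first search: the first path that reaches goal has the fewest hops."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 == max_hops:
            continue  # do not extend paths beyond the hop limit
        for nxt in routes.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

path = shortest_path("SFO", "MSO", 2)
```

With a limit of two hops the search finds the route through SEA; with a limit of one hop it returns `None`, because no direct route exists in this made-up data.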
(END ASIDE ON GRAPH COMPUTE)
POWERING AN APP
Simple App
Simple Python Code
LOADING OUR DATA
CSV
CSV files for Northwind
3 Steps to Creating the Graph
IMPORT NODES -> CREATE INDEXES -> IMPORT RELATIONSHIPS
Importing Nodes
// Create customers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/customers.csv" AS row
CREATE (:Customer {companyName: row.CompanyName, customerID: row.CustomerID, fax: row.Fax, phone: row.Phone});
// Create products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
CREATE (:Product {productName: row.ProductName, productID: row.ProductID, unitPrice: toFloat(row.UnitPrice)});
// Create suppliers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/suppliers.csv" AS row
CREATE (:Supplier {companyName: row.CompanyName, supplierID: row.SupplierID});
// Create employees
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
CREATE (:Employee {employeeID: row.EmployeeID, firstName: row.FirstName, lastName: row.LastName, title: row.Title});
// Create categories
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/categories.csv" AS row
CREATE (:Category {categoryID: row.CategoryID, categoryName: row.CategoryName, description: row.Description});
// Create orders
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MERGE (order:Order {orderID: row.OrderID}) ON CREATE SET order.shipName = row.ShipName;
Creating Indexes
CREATE INDEX ON :Product(productID);
CREATE INDEX ON :Product(productName);
CREATE INDEX ON :Category(categoryID);
CREATE INDEX ON :Employee(employeeID);
CREATE INDEX ON :Supplier(supplierID);
CREATE INDEX ON :Customer(customerID);
CREATE INDEX ON :Customer(companyName);
Creating Relationships
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (customer:Customer {customerID: row.CustomerID})
MERGE (customer)-[:PURCHASED]->(order);
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (supplier:Supplier {supplierID: row.SupplierID})
MERGE (supplier)-[:SUPPLIES]->(product);
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (product:Product {productID: row.ProductID})
MERGE (order)-[pu:INCLUDES]->(product)
ON CREATE SET pu.unitPrice = toFloat(row.UnitPrice), pu.quantity = toFloat(row.Quantity);
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (employee:Employee {employeeID: row.EmployeeID})
MERGE (employee)-[:SOLD]->(order);
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (category:Category {categoryID: row.CategoryID})
MERGE (product)-[:PART_OF]->(category);
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
MATCH (employee:Employee {employeeID: row.EmployeeID})
MATCH (manager:Employee {employeeID: row.ReportsTo})
MERGE (employee)-[:REPORTS_TO]->(manager);
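The same three steps (import nodes, create indexes, import relationships) can be mimicked in plain Python to see why the order matters. The inline CSV below is a tiny invented stand-in for employees.csv, using the same column names as the statements above. The dictionary `index` is only an analogy for a database index.

```python
import csv
import io

# Invented inline CSV standing in for the Northwind employees file.
employees_csv = io.StringIO(
    "EmployeeID,FirstName,ReportsTo\n"
    "2,Andrew,NULL\n"
    "5,Steven,2\n"
    "7,Robert,5\n"
)
rows = list(csv.DictReader(employees_csv))

# Step 1: import nodes.
nodes = [
    {"label": "Employee", "employeeID": r["EmployeeID"], "firstName": r["FirstName"]}
    for r in rows
]

# Step 2: create an index on employeeID so the next step can find nodes quickly.
index = {n["employeeID"]: n for n in nodes}

# Step 3: import relationships. Rows whose manager cannot be found are skipped,
# much as a MATCH that finds nothing leaves MERGE with nothing to act on.
relationships = []
for r in rows:
    employee = index.get(r["EmployeeID"])
    manager = index.get(r["ReportsTo"])
    if employee and manager:
        relationships.append((employee["firstName"], "REPORTS_TO", manager["firstName"]))
```

Creating the index before the relationships is what keeps step 3 fast: every relationship needs two node lookups, and without an index each one would be a scan over all nodes.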
High Performance LOADing
neo4j-import
4.58 million things

and their relationships…

Loads in 100 seconds!
WRAPPING UP
“We found Neo4j to be literally thousands of times faster
than our prior MySQL solution, with queries that require
10 to 100 times less code. Today, Neo4j provides eBay
with functionality that was previously impossible.”
Volker Pacher

Senior Developer
THANK YOU!
Ryan Boyd
@ryguyrg ryan@neo4j.com
Thank you for listening!

RDBMS to Graph Webinar

  • 1.
    RDBMS TO GRAPH Livefrom San Mateo, March 9, 2016 Webinar
  • 2.
    Data used tobe stored like this: punch tape. Or punch cards. Horrible way to read and understand data. Impossible to index easily, cross-reference, eliminate inconsistencies and cross-reference.
  • 3.
    Then we startedstoring data in tables, and “relational” databases. Sometimes those tables are human-readable. But as soon as you normalize the data to eliminate duplication and inconsistencies, many fields start referencing auto-generated numerical foreign keys. And your data becomes difficult to understand and maintain without complicated JOIN queries.
  • 4.
    ACCOUNT HOLDER 2 ACCOUNT HOLDER 1 ACCOUNT HOLDER3 CREDIT CARD BANK ACCOUNT BANK ACCOUNT BANK ACCOUNT ADDRESS PHONE NUMBER PHONE NUMBER SSN 2 UNSECURE LOAN SSN 2 UNSECURE LOAN CREDIT CARD Enter Graph Databases. The future is now. Graph Databases, like Neo4j, store data in a much more logical way. A way that represents the real world, and prioritizes the representation, discoverability and maintainability of data relationships.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
    Speed “We found Neo4jto be literally thousands of times faster than our prior MySQL solution, with queries that require 10-100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.” - Volker Pacher, Senior Developer “Minutes to milliseconds” performance Queries up to 1000x faster than RDBMS or other NoSQL
  • 10.
  • 11.
    A Naturally AdaptiveModel A Query Language Designed for Connectedness + =Agility
  • 12.
    Cypher Typical Complex SQLJoin The Same Query using Cypher MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report) WHERE boss.name = “John Doe” RETURN sub.name AS Subordinate, 
 count(report) AS Total Project Impact Less time writing queries Less time debugging queries Code that’s easier to read Less time writing queries More time understanding the answers Leaving time to ask the next question Less time debugging queries: More time writing the next piece of code Improved quality of overall code base Code that’s easier to read: Faster ramp-up for new project members Improved maintainability & troubleshooting
  • 13.
    ABOUT ME • Developedweb apps for 5 years including e-commerce, business workflow, more. • Worked at Google for 8 years on Google Apps, Cloud Platform • Technologies: Python, Java, BigQuery, Oracle, MySQL, OAuth ryan@neo4j.com @ryguyrg
  • 14.
    NEO4j USE CASES RealTime Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations
  • 15.
    NEO4j USE CASES RealTime Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations GRAPH THINKING: Real Time Recommendations VIEWED VIEWED BOUGHT VIEWED BOUGHT BOUGHT BOUGHT BOUGHT Real-Time Recommendations could be about finding the relationsships relevant to make recommend a product or a service…. …which is exactly why Walmart is using Neo4j.
  • 16.
    “As the currentmarket leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” Marcos Wada Software Developer, Walmart NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations
  • 17.
    NEO4j USE CASES RealTime Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations GRAPH THINKING: Master Data Management MANAGES MANAGES LEADS REGION M ANAG ES MANAGES REGION LEADS LEADS COLLABORATES Master Data Management is about bringing together all the entities within an organization and external to the organization. To understand the relationship between each of them.
  • 18.
    Neo4j is theheart of Cisco HMP: used for governance and single source of truth and a one-stop shop for all of Cisco’s hierarchies. NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations Cisco uses it for this — to power their content management, resources and knowledge-base articles for use by sales teams. It also powers product recommendations to make sure customers are getting the power of their offerings. Although this project is focused on sales teams, another group has used Neo4j to power all of their helpdesk content -
  • 19.
    NEO4j USE CASES RealTime Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations GRAPH THINKING: Master Data Management Solu%on Support Case Support Case Knowledge Base Ar%cle Message Knowledge Base Ar%cle Knowledge Base Ar%cle Neo4j is the heart of Cisco’s Helpdesk Solution too. Master Data Management is about bringing together all the entities within an organization and external to the organization. To understand the relationship between each of them.
  • 20.
    NEO4j USE CASES RealTime Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations GRAPH THINKING: Fraud Detection O PENED_ACCO UNT HAS IS_ISSUED HAS LIVES LIVES IS_ISSUED OPENED_ACCOUNT Discovering fraud is another use case that is particularly suitable to graphs, because it’s all about about finding fraudulent patterns. Here we work with the top banks and insurance companies as well as many governments..
  • 21.
    “Graph databases offernew methods of uncovering fraud rings and other sophisticated scams with a high-level of accuracy, and are capable of stopping advanced fraud scenarios in real-time.” Gorka Sadowski Cyber Security Expert NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations
  • 22.
    GRAPH THINKING: Graph BasedSearch NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations PUBLISH INCLUDE INCLUDE CREATE CAPTURE IN IN SOURCE USES USES IN IN USES SOURCE SOURCE
  • 23.
    Uses Neo4j tomanage the digital assets inside of its next generation in-flight entertainment system. NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations
  • 24.
    NEO4j USE CASES RealTime Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations BROWSES CONNECTS BRIDGES ROUTES POWERS ROUTES POWERS POWERS HOSTS QUERIES GRAPH THINKING: Network & IT-Operations Decency analysis Root cause analysis
  • 25.
    Uses Neo4j fornetwork topology analysis for big telco service providers NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations
  • 26.
    GRAPH THINKING: Identity AndAccess Management NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations TRUSTS TRUSTS ID ID AUTHENTICATES AUTHENTICATES O W NS OWNS CAN_READ Think of organizational hierarchies. No longer is it just a tree.
  • 27.
    UBS was therecipient of the 2014 Graphie Award for “Best Identify And Access Management App” NEO4j USE CASES Real Time Recommendations Master Data Management Fraud Detection Identity & Access Management Graph Based Search Network & IT-Operations
  • 28.
    Neo4j Adoption bySelected Verticals SOFTWARE FINANCIAL SERVICES RETAIL MEDIA & BROADCASTING SOCIAL NETWORKS TELECOM HEALTHCARE
  • 29.
    AGENDA • Use Cases •SQL Pains • Building a Neo4j Application • Moving from RDBMS -> Graph Models • Walk through an Example • Creating Data in Graphs • Querying Data
  • 30.
    I hired thiskid for all the handwriting you’ll see throughout the presentation. So, don’t blame me.
  • 31.
    SQL Day in theLife of a RDBMS Developer Let’s explore how your SQL developer works today.
  • 32.
    They work withdata in tables. Here’s a table of people and where they're from, their hair color and the university they attended. This table is fairly natural, but duplicating values across multiple rows. Let’s say you want to change the name of a university or a country, you’d have to update all rows.
  • 33.
    So, instead, you’dcreate a separate table for the country, with an ID that references it. This is your primary key.
  • 34.
    This allows youto add additional properties.
  • 35.
    Now, you usethat ID to reference the country in the people table - a foreign key.
  • 36.
    And you’d wantto normalize the university table as well.
  • 37.
    And use theuniversity ID to reference it. Now your table it a lot less readable.
  • 38.
    So, we seethis set of 3 tables with arrows indicating references between primary keys and foreign keys, used in JOINs.
  • 39.
    SELECT p.name, c.country, c.leader, p.hair, u.name,u.pres, u.state FROM people p LEFT JOIN country c ON c.ID=p.country LEFT JOIN uni u ON p.uni=u.id WHERE u.state=‘CT’ Your SQL looks like this. Only, this is a super simple JOIN across 3 tables. I’ve often had to work with 10+ tables being JOINed.
  • 40.
  • 41.
  • 42.
  • 43.
    Meanwhile, it’s expensiveto find data. So we add indexes to make it easier. But when we have to do index lookups for each and every JOIN? And we have a dozen JOINs? That’s expensive.
  • 44.
    What’s the solution? Denormalize! But now hard to maintain and have consistent data.
  • 45.
    • Complex tomodel and store relationships • Performance degrades with increases in data • Queries get long and complex • Maintenance is painful SQL Pains
  • 46.
    • Easy tomodel and store relationships • Performance of relationship traversal remains constant with growth in data size • Queries are shortened and more readable • Adding additional properties and relationships can be done on the fly - no migrations Graph Gains
  • 47.
    John Resig, whoyou may know as the creator of jQuery, loves Neo4j because it simplifies life.
  • 48.
    What does thisGraph look like? So you’ve seen what tables look like. How do graphs make this better?
  • 49.
    CYPHER Ann DanLoves The obligatory“Ann Loves Dan” example
  • 50.
    Property Graph Model CREATE(:Person { name:“Dan”} ) - [:LOVES]-> (:Person { name:“Ann”} ) LOVES LABEL PROPERTY NODE NODE LABEL PROPERTY The whiteboard model is the physical model.
  • 52.
  • 53.
    David Meza, chiefknowledge architect at NASA, had this to say.
  • 54.
    How do youuse Neo4j? CREATE MODEL + LOAD DATA QUERY DATA Create your Graph Model Load your data Query your data
  • 55.
    How do youuse Neo4j? Querying can be done in the Neo4j Browser.
  • 56.
    Querying can bedone in the Neo4j Browser.
  • 57.
    How do youuse Neo4j?
  • 58.
    Language Drivers javascript, java,ruby, .net, python, php
  • 59.
  • 60.
    Native Server-Side Extensions Needto get every last ounce of performance? You can write server-side extensions in Java.
  • 62.
    Architectural Options Data Storage and Business Rules Execu5on Data Mining and Aggrega5on Applica'on Graph Database Cluster Neo4j Neo4j Neo4j Ad Hoc Analysis Bulk Analy'c Infrastructure Hadoop, EDW … Data Scien'st End User Databases Rela5onal NoSQL Hadoop
  • 63.
    RDBMS to GraphOptions MIGRATE ALL DATA MIGRATE SUBSET DUPLICATE SUBSET Non-Graph Queries Graph Queries Graph Queries Non-Graph Queries All Queries Rela3onal Database Graph Database Application Application Application Non Graph Data All Data
  • 64.
  • 67.
  • 68.
    Northwind - thecanonical RDBMS Example
  • 69.
  • 71.
  • 72.
  • 73.
  • 74.
  • 75.
  • 76.
    (Simple) JOIN TablesBecome Relationships
  • 77.
    Attributed JOIN Tables-> Relationships with Properties
  • 78.
  • 79.
  • 80.
  • 81.
    using openCypher Declarative querylanguage Easy to learn for someone familiar with languages like SQL But optimized for graphs, and quickly readable
  • 82.
  • 83.
    Who do peoplereport to? MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee) RETURN *
  • 84.
    Who do peoplereport to? Results can be returned as nodes and relationships
  • 85.
    Who do peoplereport to? MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee) RETURN e.employeeID AS managerID, e.firstName AS managerName, sub.employeeID AS employeeID, sub.firstName AS employeeName; or alternatively as a table.
  • 86.
    Who do peoplereport to?
  • 87.
    Who does Robertreport to? MATCH p=(e:Employee)<-[:REPORTS_TO]-(sub:Employee) WHERE sub.firstName = ‘Robert’ RETURN p
  • 88.
    Who does Robertreport to?
  • 89.
    What is Robert’sreporting chain? MATCH p=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee) WHERE sub.firstName = ‘Robert’ RETURN p But the power of the graph is in the ability to query arbitrary length paths. See the asterisks.
  • 90.
    What is Robert’s reporting chain?
  • 91.
    Who’s the Big Boss?
    MATCH (e:Employee)
    WHERE NOT (e)-[:REPORTS_TO]->()
    RETURN e.firstName as bigBoss
  • 92.
  • 93.
    Product Cross-Selling
    MATCH (choc:Product {productName:'Chocolade'})
          <-[:INCLUDES]-(:Order)<-[:SOLD]-(employee),
          (employee)-[:SOLD]->(o2)-[:INCLUDES]->(other:Product)
    RETURN employee.firstName, other.productName, COUNT(DISTINCT o2) as count
    ORDER BY count DESC
    LIMIT 5;
  • 94.
  • 95.
    (ASIDE ON GRAPH COMPUTE)
    Optimized for OLTP
    But can be used for Graph Compute
    Either with built-in functions
    Or server-side extensions
    Or via exporting data to Spark / GraphX for analysis
  • 96.
    Shortest Path Between Airports
    MATCH p = shortestPath(
      (a:Airport {code:"SFO"})-[*0..2]->
      (b:Airport {code: "MSO"}))
    RETURN p
    Example using built-in algorithms. Dijkstra is also available for weighted paths.
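For readers who want to see what a hop-limited shortest-path search does, here is a plain breadth-first search in Python. It is a stand-in for illustration, not Neo4j's implementation of `shortestPath`, and the route map below is invented sample data rather than a real flight schedule.

```python
from collections import deque


def shortest_path(graph, start, goal, max_hops):
    """Breadth-first search for the fewest-hop directed path.

    `graph` maps a node to the nodes it points at. Returns the path as
    a list of nodes, or None if no path exists within `max_hops` edges.
    """
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_hops:
            continue  # this path already uses the allowed number of edges
        for neighbor in graph.get(path[-1], ()):
            if neighbor in visited:
                continue
            new_path = path + [neighbor]
            if neighbor == goal:
                return new_path
            visited.add(neighbor)
            queue.append(new_path)
    return None


# Hypothetical directed routes between airport codes.
flights = {"SFO": ["SLC", "DEN"], "SLC": ["MSO"], "DEN": ["MSO"], "MSO": []}

print(shortest_path(flights, "SFO", "MSO", max_hops=2))
```

The `max_hops` argument plays the role of the upper bound in `[*0..2]`; with a bound of 1 the same search finds no route in this sample map.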
  • 97.
    (END ASIDE ON GRAPH COMPUTE)
  • 98.
  • 99.
  • 100.
  • 101.
  • 102.
  • 103.
  • 104.
  • 105.
  • 106.
  • 107.
    CSV files for Northwind
  • 108.
    CSV files for Northwind
  • 109.
    3 Steps to Creating the Graph
    1. IMPORT NODES
    2. CREATE INDEXES
    3. IMPORT RELATIONSHIPS
  • 110.
    Importing Nodes
    // Create customers
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/customers.csv" AS row
    CREATE (:Customer {companyName: row.CompanyName, customerID: row.CustomerID, fax: row.Fax, phone: row.Phone});
    // Create products
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
    CREATE (:Product {productName: row.ProductName, productID: row.ProductID, unitPrice: toFloat(row.UnitPrice)});
  • 111.
    Importing Nodes
    // Create suppliers
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/suppliers.csv" AS row
    CREATE (:Supplier {companyName: row.CompanyName, supplierID: row.SupplierID});
    // Create employees
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
    CREATE (:Employee {employeeID: row.EmployeeID, firstName: row.FirstName, lastName: row.LastName, title: row.Title});
  • 112.
    Importing Nodes
    // Create categories
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/categories.csv" AS row
    CREATE (:Category {categoryID: row.CategoryID, categoryName: row.CategoryName, description: row.Description});
    // Create orders
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
    MERGE (order:Order {orderID: row.OrderID})
      ON CREATE SET order.shipName = row.ShipName;
  • 113.
    Creating Indexes
    CREATE INDEX ON :Product(productID);
    CREATE INDEX ON :Product(productName);
    CREATE INDEX ON :Category(categoryID);
    CREATE INDEX ON :Employee(employeeID);
    CREATE INDEX ON :Supplier(supplierID);
    CREATE INDEX ON :Customer(customerID);
    CREATE INDEX ON :Customer(customerName);
  • 114.
    Creating Relationships
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
    MATCH (order:Order {orderID: row.OrderID})
    MATCH (customer:Customer {customerID: row.CustomerID})
    MERGE (customer)-[:PURCHASED]->(order);
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
    MATCH (product:Product {productID: row.ProductID})
    MATCH (supplier:Supplier {supplierID: row.SupplierID})
    MERGE (supplier)-[:SUPPLIES]->(product);
  • 115.
    Creating Relationships
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
    MATCH (order:Order {orderID: row.OrderID})
    MATCH (product:Product {productID: row.ProductID})
    MERGE (order)-[pu:INCLUDES]->(product)
      ON CREATE SET pu.unitPrice = toFloat(row.UnitPrice), pu.quantity = toFloat(row.Quantity);
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
    MATCH (order:Order {orderID: row.OrderID})
    MATCH (employee:Employee {employeeID: row.EmployeeID})
    MERGE (employee)-[:SOLD]->(order);
  • 116.
    Creating Relationships
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
    MATCH (product:Product {productID: row.ProductID})
    MATCH (category:Category {categoryID: row.CategoryID})
    MERGE (product)-[:PART_OF]->(category);
    USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
    MATCH (employee:Employee {employeeID: row.EmployeeID})
    MATCH (manager:Employee {employeeID: row.ReportsTo})
    MERGE (employee)-[:REPORTS_TO]->(manager);
  • 117.
    High Performance LOADing
    neo4j-import
    4.58 million things and their relationships…
    Loads in 100 seconds!
  • 118.
  • 119.
    “We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10 to 100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.”
    Volker Pacher, Senior Developer
  • 120.
    THANK YOU!
    Ryan Boyd
    @ryguyrg
    ryan@neo4j.com
    Thank you for listening!