Graph databases represent graph structures with nodes, edges and properties. Neo4j, an open-source graph database, is reliable and fast for managing and querying highly connected data. This session explores how to install and configure Neo4j, create nodes and relationships, query with the Cypher Query Language, import data, and use Neo4j in concert with SQL Server, providing answers and insight with visual diagrams about the connected data you have in your SQL Server databases!
1. Graph Databases
for SQL Server Professionals
Stéphane Fréchette
Thursday September 18, 2014
2. Who am I?
My name is Stéphane Fréchette
SQL Server MVP | Consultant | Speaker | Database & BI Architect | NoSQL.
Drums, good food and fine wine. Founder @ukubu, @GatineauOuverte,
@TEDxGatineau
I have a passion for architecting, designing and building solutions that
matter.
Twitter: @sfrechette
Blog: stephanefrechette.com
Email: stephanefrechette@ukubu.com
3. Session Outline
• What is a Graph?
• What is Neo4j?
• Data Modeling – The Property Graph
• Cypher Query Language
• Importing Data…
• Use Cases
• Demos
• Resources
9. What is Neo4j?
An open-source graph database by Neo Technology. Neo4j stores data in nodes
connected by directed, typed relationships, with properties on both, also known
as a Property Graph
• Fully ACID compliant
• Massively scalable, up to several billion nodes/relationships/properties
• Highly available when distributed across multiple machines
• Accessible through a convenient REST interface or an object-oriented Java API
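As a rough illustration of the property graph model described above, the following Python sketch models nodes and directed, typed relationships that both carry properties. It is not Neo4j's API: the `Node` and `Relationship` classes, the `ORGANIZES` relationship type and the `since` property are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """A node with labels and arbitrary key/value properties."""
    labels: List[str]
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed relationship from start to end, with its own properties."""
    start: Node
    type: str
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


stephane = Node(["Member"], {"name": "Stephane"})
sql_ug = Node(["Meetup"], {"name": "Ottawa SQL Server User Group"})

# 'since' is a made-up property, shown only because relationships can hold data too.
organizes = Relationship(stephane, "ORGANIZES", sql_ug, {"since": 2014})

# Print the relationship in a Cypher-like arrow notation.
print(
    f"({organizes.start.properties['name']})"
    f"-[:{organizes.type}]->"
    f"({organizes.end.properties['name']})"
)
```

The direction matters: the relationship points from the member to the meetup, and the type names what the connection means.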
11. Example: Meetup Data In SQL Server
Tables: Member, MeetupOrganizer, MeetupMember, Meetup

Member
ID  Member
1   Daniel
2   Stephane
3   John
4   Randy

Meetup
ID  Name
1   Ottawa SQL Server User Group
2   Ottawa JavaScript
3   Ottawa Visio User Group
4   Ottawa Tableau User Group
5   Dirty Dancing Ottawa

MeetupOrganizer
MemberID  MeetupID
2         1
1         2
3         3
2         4
3         5

MeetupMember
MemberID  MeetupID
3         1
3         2
4         2
4         4
1         5
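To make the relational version concrete, here is a minimal sketch that loads the slide's rows into an in-memory SQLite database (a stand-in for SQL Server, used only so the example is self-contained) and answers "which meetups does Stephane organize?" with two joins. It assumes the first MemberID/MeetupID listing on the slide is the MeetupOrganizer table, since it has exactly one row per meetup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (ID INTEGER PRIMARY KEY, Member TEXT);
CREATE TABLE Meetup (ID INTEGER PRIMARY KEY, Name TEXT);
-- Assumption: the first link table on the slide is MeetupOrganizer.
CREATE TABLE MeetupOrganizer (MemberID INTEGER, MeetupID INTEGER);

INSERT INTO Member VALUES (1, 'Daniel'), (2, 'Stephane'), (3, 'John'), (4, 'Randy');
INSERT INTO Meetup VALUES
    (1, 'Ottawa SQL Server User Group'),
    (2, 'Ottawa JavaScript'),
    (3, 'Ottawa Visio User Group'),
    (4, 'Ottawa Tableau User Group'),
    (5, 'Dirty Dancing Ottawa');
INSERT INTO MeetupOrganizer VALUES (2, 1), (1, 2), (3, 3), (2, 4), (3, 5);
""")

# The relationship lives in a separate table, so reaching a meetup from a
# member takes one join per hop.
rows = conn.execute(
    """
    SELECT g.Name
    FROM Member AS m
    JOIN MeetupOrganizer AS mo ON mo.MemberID = m.ID
    JOIN Meetup AS g ON g.ID = mo.MeetupID
    WHERE m.Member = ?
    ORDER BY g.Name
    """,
    ("Stephane",),
).fetchall()

organized = [name for (name,) in rows]
print(organized)
```

Each additional hop (for example, members of the meetups Stephane organizes) adds more joins through the link tables.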
12. Example: Meetup Data In a Graph
Legend: Member, Meetup
Member nodes: name: ‘Daniel’, name: ‘Stephane’, name: ‘John’, name: ‘Randy’
Meetup nodes: name: ‘Ottawa SQL Server User Group’, name: ‘Ottawa JavaScript’, name: ‘Ottawa Visio User Group’, name: ‘Ottawa Tableau User Group’, name: ‘Dirty Dancing Ottawa’
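The relationships drawn on this slide are not recoverable from the text, so the sketch below assumes a few of them (taken from the relational example) and uses hypothetical relationship types `ORGANIZES` and `MEMBER_OF`. It shows the idea of a graph traversal in plain Python: following typed edges from a starting node instead of joining tables.

```python
from typing import List, Tuple

# (start node name, relationship type, end node name)
Edge = Tuple[str, str, str]

# A small, assumed subset of the Meetup graph.
edges: List[Edge] = [
    ("Stephane", "ORGANIZES", "Ottawa SQL Server User Group"),
    ("Stephane", "ORGANIZES", "Ottawa Tableau User Group"),
    ("John", "MEMBER_OF", "Ottawa SQL Server User Group"),
    ("Randy", "MEMBER_OF", "Ottawa Tableau User Group"),
]


def match(start: str, rel_type: str) -> List[str]:
    """Follow outgoing relationships of one type from a start node."""
    return [end for s, t, end in edges if s == start and t == rel_type]


def members_of_meetups_organized_by(name: str) -> List[str]:
    """Two hops: organizer -> meetups, then meetups <- members."""
    meetups = set(match(name, "ORGANIZES"))
    return sorted({s for s, t, end in edges if t == "MEMBER_OF" and end in meetups})


print(match("Stephane", "ORGANIZES"))
print(members_of_meetups_organized_by("Stephane"))
```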
13. Cypher Query Language
Cypher is a declarative graph query language that allows for expressive and
efficient querying and updating of the graph store
• Pattern-matching
• Declarative: what to retrieve, not how to retrieve it
• Inspired by other well-known languages (SQL, SPARQL, Haskell, Python)
• Aggregation, Ordering, Limit
• Update the Graph
14. Cypher and T-SQL
Cypher also has a number of keywords with direct equivalents in SQL,
which makes it a curiously familiar language
• WHERE
• ORDER BY
• LIMIT
• SUM, COUNT, STDEVP, MIN, MAX etc…
• LTRIM, UPPER, LOWER, REPLACE, LEFT, RIGHT, SUBSTRING
• DISTINCT
• CASE
(SQL Server Pros)-[:WILL_LOVE]->(Cypher)
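To show how close the two languages can look, here is a hedged side-by-side sketch of the same question ("the first three meetups, alphabetically, that John is a member of") in T-SQL and in Cypher, held as Python strings so they can be passed to whichever driver is in use. The table and column names follow the Meetup example; the `Member` and `Meetup` labels and the `MEMBER_OF` relationship type are assumptions. Note that T-SQL spells the row limit as `TOP`, where Cypher uses `LIMIT`.

```python
# T-SQL: the membership is a link table, reached through two joins.
TSQL = """
SELECT TOP (3) g.Name
FROM Member AS m
JOIN MeetupMember AS mm ON mm.MemberID = m.ID
JOIN Meetup AS g ON g.ID = mm.MeetupID
WHERE m.Member = 'John'
ORDER BY g.Name;
"""

# Cypher: the membership is a relationship, expressed as a pattern.
# MEMBER_OF is an assumed relationship type, not taken from the slides.
CYPHER = """
MATCH (m:Member)-[:MEMBER_OF]->(g:Meetup)
WHERE m.name = 'John'
RETURN g.name
ORDER BY g.name
LIMIT 3;
"""

print(TSQL)
print(CYPHER)
```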
19. Importing Data…
Some important considerations…
Different import scenarios
• Dataset size: 1000s, 100000s, 10000000s
• Dataset format (source): Database, File (CSV, Spreadsheet, GraphML, Geoff), Service, Other
• Import type: Initial Bulk Load, Incremental Load, Initial Bulk Load + Incremental Load
Different import tools
• Spreadsheet based
• Neo4j-shell based: (Cypher, neo4j-shell-tools, Cypher LOAD CSV)
• Command-line based: Batch Importer
• Neo4j Browser based
• ETL Tools: (Talend, Mulesoft, Pentaho Kettle)
• Custom software: (Java API, REST API, Spring Data Neo4j)
20. Many different mappings
Import Scenarios
Import Tools
Not always clear what you should be using
Depends on your skillsets, dataset size… (lots of other stuff)
Choose wisely!
23. Importing using Spreadsheets
Very small datasets (< 1000), easy to use
• Format data in spreadsheet
• Generate Cypher statements with formulas
• Copy and execute Cypher in Neo4j browser
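The spreadsheet approach boils down to string concatenation: one Cypher statement per row. The sketch below mimics that in Python, using the Member rows from the Meetup example. The `Member` label, the `id` and `name` property names, and the helper `cypher_string` are illustrative assumptions; in a spreadsheet the same result would come from a concatenation formula over the row's cells.

```python
rows = [(1, "Daniel"), (2, "Stephane"), (3, "John"), (4, "Randy")]


def cypher_string(value: str) -> str:
    """Quote a value as a Cypher string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


# One CREATE statement per row, ready to paste into the Neo4j browser.
statements = [
    f"CREATE (:Member {{id: {member_id}, name: {cypher_string(name)}}});"
    for member_id, name in rows
]

for statement in statements:
    print(statement)
```

Escaping matters even for small datasets: a name containing an apostrophe would otherwise break the generated statement.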
25. Importing using neo4j-shell-tools
Small to medium size datasets
https://github.com/jexp/neo4j-shell-tools
• Format data in CSV files
• Create import-cypher commands for neo4j-shell-tools
• Execute commands from neo4j-shell
27. Importing using LOAD CSV
Native Cypher
• Format data in CSV files
• Create “LOAD CSV” commands
• Execute command from neo4j-shell or browser
• Additional “cleanup” for Labels and RelTypes
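The first two steps above can be sketched as follows: write the Member rows to a CSV file, then build a `LOAD CSV` statement that reads it. The file name, the `Member` label and the property names are assumptions for illustration. The statement uses `toInteger`, which is the conversion function in current Neo4j releases (older releases used `toInt`), and a server may restrict which file locations `LOAD CSV` is allowed to read, so the path would likely need adjusting.

```python
import csv
import tempfile
from pathlib import Path

rows = [(1, "Daniel"), (2, "Stephane"), (3, "John"), (4, "Randy")]

# Step 1: format the data as a CSV file with a header row.
csv_path = Path(tempfile.mkdtemp()) / "members.csv"
with csv_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["ID", "Member"])
    writer.writerows(rows)

# Step 2: create the LOAD CSV command. With headers, each row is exposed
# as a map keyed by column name, and every value arrives as a string.
load_csv = (
    f"LOAD CSV WITH HEADERS FROM '{csv_path.as_uri()}' AS row\n"
    "CREATE (:Member {id: toInteger(row.ID), name: row.Member});"
)

print(load_csv)
```

The printed statement is what would be executed from neo4j-shell or the browser in the third step.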
29. Importing using Batch Importer
Non-transactional import, suited for very large datasets
• Format data in TSV files
• Execute Batch Import command
• Copy store files to Neo4j Server directory
• Start Neo4j Server with generated store files
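The first step, formatting data as TSV, can be sketched with Python's `csv` module and a tab delimiter. The column headers below (`id`, `name`, `label`, `start`, `end`, `type`) and the `ORGANIZES` relationship type are hypothetical; the exact header layout the Batch Importer expects should be taken from its own documentation, not from this sketch.

```python
import csv
import io
from typing import Iterable, Sequence


def to_tsv(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Render a header and rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# One file for nodes, one for relationships (assumed layout).
nodes_tsv = to_tsv(
    ["id", "name", "label"],
    [(1, "Daniel", "Member"), (2, "Stephane", "Member")],
)
rels_tsv = to_tsv(
    ["start", "end", "type"],
    [(2, 1, "ORGANIZES")],
)

print(nodes_tsv)
print(rels_tsv)
```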
30. Use Cases
Principal uses of graph databases include:
• Network and Data Center Management
(Queries: Impact Analysis, Root Cause Analysis, Quality-of-Service Mapping, Asset Management)
• Authorization and Access
(Queries: Access Management, Interconnected Group Organization, Provenance)
• Social
(Queries: Friend Recommendations, Sharing & Collaboration, Influencer Analysis)
• Geo
(Queries: Routing, Logistics, Capacity Planning)
• Recommendations
(Queries: Product, Social, Service, and Professional Recommendations)
• Fraud Detection
http://www.neotechnology.com/neo4j-use-cases/