Relational to Graph
Importing Data into Neo4j
June 2015
Michael Hunger
michael@neo4j.org |@mesirii
Agenda
• Review Webinar Series
• Importing Data into Neo4j
• Getting Data from RDBMS
• Concrete Examples
• Demo
• Q&A
Webinar Review
Relational to Graph
Webinar Review – Relational to Graph
• Introduction and Overview
• Introduction of Neo4j, Solving RDBMS Issues, Northwind Demo
• Modeling Concerns
• Modeling in Graphs and RDBMS, Good Modeling Practices,
• Model first, incremental Modeling, Model Transformation (Rules)
• Import
• Importing into Neo4j, Getting Data from RDBMS, Concrete Examples
• NEXT: Querying
• SQL to Cypher, Comparison, Example Queries, Hard in SQL -> Easy and Fast
in Cypher
Why are we doing this?
The Graph Advantage
Relational DBs Can’t Handle Relationships Well
• Cannot model or store data and relationships
without complexity
• Performance degrades with number and levels
of relationships, and database size
• Query complexity grows with need for JOINs
• Adding new types of data and relationships
requires schema redesign, increasing time to
market
… making traditional databases inappropriate
when data relationships are valuable in real-time
Slow development
Poor performance
Low scalability
Hard to maintain
Unlocking Value from Your Data Relationships
• Model your data naturally as a graph
of data and relationships
• Drive graph model from domain and
use-cases
• Use relationship information in real-time
to transform your business
• Add new relationships on the fly to
adapt to your changing requirements
High Query Performance with a Native Graph DB
• Relationships are first-class citizens
• No need for joins, just follow
pre-materialized relationships of nodes
• Query & Data-locality – navigate out
from your starting points
• Only load what’s needed
• Aggregate and project results as you
go
• Optimized disk and memory model
for graphs
Importing into Neo4j
APIs, Tools, Tricks
Getting Data into Neo4j: CSV
Cypher-Based “LOAD CSV” Capability
• Transactional (ACID) writes
• Initial and incremental loads of up to
10 million nodes and relationships
• From HTTP and Files
• Power of Cypher
• Create and Update Graph Structures
• Data conversion, filtering, aggregation
• Destructuring of Input Data
• Transaction Size Control
• Also via Neo4j-Shell
[Diagram: CSV input, scale indicator 10M]
Getting Data into Neo4j: CSV
Command-Line Bulk Loader neo4j-import
• For initial database population
• Scale across CPUs and disk performance
• Efficient RAM usage
• Split- and compressed file support
• For loads up to 10B+ records
• Up to 1M records per second
[Diagram: CSV input, scale indicator 100B]
Getting Data into Neo4j: APIs
Custom Cypher-Based Loader
• Uses the transactional Cypher HTTP endpoint
• Parameterized, batched, concurrent
Cypher statements
• Any programming/scripting language with a
driver or plain HTTP requests
• Also for JSON and other formats
• Also available as a JDBC driver
[Diagram: any data, read by several programs, scale indicator 10M]
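The "parameterized, batched" idea above can be sketched in Python as follows. This is a minimal sketch, not the loader shown in the webinar: it only builds the JSON request bodies and sends nothing. It assumes the Neo4j 2.x transactional endpoint (typically `POST /db/data/transaction/commit`) and that era's `{rows}` parameter syntax (later versions use `$rows`). The `City` statement, the helper names and the sample rows are illustrative assumptions.

```python
import json
from itertools import islice


def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# One parameterized statement, reused for every batch (illustrative only).
# {rows} is the Neo4j 2.x parameter placeholder; newer versions use $rows.
STATEMENT = "UNWIND {rows} AS row MERGE (c:City {name: row.city})"


def build_payload(chunk):
    """Build the JSON body for one request to the transactional endpoint."""
    return json.dumps(
        {"statements": [{"statement": STATEMENT, "parameters": {"rows": chunk}}]}
    )


# Made-up sample rows; a real loader would stream these from the source system.
rows = [{"city": "GENT"}, {"city": "DESTELBERGEN"}, {"city": "GENT"}]
payloads = [build_payload(chunk) for chunk in batches(rows, 2)]
print(len(payloads))  # 2 requests: one with two rows, one with one row
```

Each payload could then be posted by any HTTP client, and several workers could send batches concurrently as long as they do not contend for the same nodes.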
Getting Data into Neo4j: APIs
JVM Transactional Loader
• Use Neo4j’s Java-API
• From any JVM language, concurrent
• Fine grained TX Management
• Create Nodes and Relationships directly
• Also possible as Server extension
• Arbitrary data loading
[Diagram: any data, read by several programs, scale indicator 1B]
Getting Data into Neo4j: API
Bulk Loader API
• Used by neo4j-import tool
• Create Streams of node and relationship
data
• Id-groups, id-handling & generation,
conversions
• Highly concurrent and memory efficient
• High performance CSV Parser, Decorators
[Diagram: CSV input, scale indicator 100B]
Import Performance: Some Numbers
• Cypher Import 10k-10M records
• Import 100K-100M records per
second transactionally
• Bulk import tens of billions of records
in a few hours
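As a rough plausibility check on these numbers, the arithmetic can be written out. This is a back-of-the-envelope sketch that assumes a constant rate equal to the peak of 1M records per second quoted for neo4j-import; real imports will vary with hardware and data shape.

```python
def estimated_hours(records, records_per_second):
    """Return the wall-clock hours needed at a constant import rate."""
    return records / records_per_second / 3600


# 10 billion records at the quoted peak of 1M records per second
print(round(estimated_hours(10_000_000_000, 1_000_000), 2))  # 2.78
```

At that rate, 10 billion records take just under three hours, which is consistent with "tens of billions of records in a few hours".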
Import Performance: Hardware Requirements
• Fast disk: SSD or SSD RAID
• Many Cores
• Medium amount of RAM (8-64G)
• Local Data Files, compress to save space
• High performance concurrent connection
to relational DB
• Linux and OS X work better than Windows
(file system handling)
• Disable Virus Scanners, Check Disk
Scheduler
Accessing Relational Data
Dump, Connect, Extract
Accessing Relational Data
• Dump to CSV: all relational databases can dump query results and tables to CSV
• Access with a DB driver: use JDBC/ODBC or another driver to pull out selected datasets
• Use built-in or external endpoints: some databases expose HTTP APIs or can be integrated (DataClips)
• Use ETL tools: existing ETL tools can read from relational databases and write to Neo4j, e.g. via JDBC
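The "dump to CSV" and "access with a DB driver" options can be combined in a few lines. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for a real relational database. The `city` table, its columns and the helper name `dump_query_to_csv` are assumptions made up for illustration.

```python
import csv
import io
import sqlite3


def dump_query_to_csv(conn, sql, out):
    """Write the result of `sql` to `out` as CSV, with a header row."""
    cursor = conn.execute(sql)
    writer = csv.writer(out)
    # cursor.description holds one tuple per result column; item 0 is its name
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Hypothetical stand-in schema and data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO city VALUES (?, ?)",
    [("Gent", "9000"), ("Destelbergen", "9070")],
)

buffer = io.StringIO()
dump_query_to_csv(conn, "SELECT name, zip FROM city ORDER BY zip", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing the header row from the cursor metadata keeps the CSV self-describing, which is what `LOAD CSV WITH HEADERS` relies on.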
Importing Your Data
Examples
Import Demo
Cypher-Based “LOAD CSV” Capability
• Use to import address data
Command-Line Bulk Loader neo4j-import
• Chicago Crime Dataset
Relational Import Tool neo4j-rdbms-import
• Proof of Concept
JDBC + API
CSV
LOAD CSV
Workhorse of Graph ETL
Data Quality – Beware of Real-World Data!
• Messy! Don't trust the data
• Byte order mark
• Binary zeros, non-text characters
• Inconsistent line breaks
• Header inconsistent with data
• Special characters in non-quoted text
• Unexpected newlines in quoted and unquoted text fields
• Stray quotes
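Several of the problems listed above can be detected before an import starts. The following is a minimal sketch and not a replacement for a dedicated tool: it checks raw bytes for a UTF-8 byte order mark, NUL bytes and mixed line endings, then compares each record's field count with the header. The function name and the sample data are made up for illustration.

```python
import csv
import io


def check_csv_bytes(raw):
    """Return a list of human-readable problems found in raw CSV bytes."""
    problems = []
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("UTF-8 byte order mark at start of file")
    if b"\x00" in raw:
        problems.append("binary zero (NUL) bytes present")
    crlf = raw.count(b"\r\n")
    lf_only = raw.count(b"\n") - crlf
    if crlf and lf_only:
        problems.append("inconsistent line breaks (CRLF and LF mixed)")
    # Decode leniently and drop NULs so the csv module can still parse the rest
    text = raw.decode("utf-8-sig", errors="replace").replace("\x00", "")
    records = list(csv.reader(io.StringIO(text, newline="")))
    if records:
        width = len(records[0])
        for number, record in enumerate(records[1:], start=2):
            if record and len(record) != width:
                problems.append(
                    f"record {number}: {len(record)} fields, header has {width}"
                )
    return problems


# Made-up sample: BOM, mixed line endings, and a short last record
sample = b"\xef\xbb\xbfcity,zip\r\nGent,9000\nDestelbergen\n"
for problem in check_csv_bytes(sample):
    print(problem)
```

A field-count mismatch often points to stray quotes or unexpected newlines, so it is a cheap first signal for those cases as well.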
CSV – Workhorse of Data Exchange
• Most Databases, ETL and Office-Tools
can read and write CSV
• Format only loosely specified
• Problems with quotes, newlines, charsets
• Some good checking tools (CSVKit)
Address Dataset
• Exported as large JOIN between
• City
• Zip
• Street
• Number
• Enterprise
• address.csv

EntityNumber  TypeOfAddress  Zipcode  MunicipalityNL  StreetNL             StreetFR             HouseNr
200.065.765   REGO           9070     Destelbergen    Dendermondesteenweg  Dendermondesteenweg  430
200.068.636   REGO           9000     Gent            Stropstraat          Stropstraat          1
LOAD CSV
// create constraints
CREATE CONSTRAINT ON (c:City) ASSERT c.name IS UNIQUE;
CREATE CONSTRAINT ON (z:Zip) ASSERT z.name IS UNIQUE;
// manage tx
USING PERIODIC COMMIT 50000
// load csv row by row
LOAD CSV WITH HEADERS FROM "file:address.csv" AS csv
// transform values
WITH DISTINCT toUpper(csv.City) AS city, toUpper(csv.Zip) AS zip
// create nodes
MERGE (:City {name: city})
MERGE (:Zip {name: zip});
LOAD CSV
// manage tx
USING PERIODIC COMMIT 100000
// load csv row by row
LOAD CSV WITH HEADERS FROM "file:address.csv" AS csv
// transform values
WITH DISTINCT toUpper(csv.City) AS city, toUpper(csv.Zip) AS zip
// find nodes
MATCH (c:City {name: city}), (z:Zip {name: zip})
// create relationships
MERGE (c)-[:HAS_ZIP_CODE]->(z);
LOAD CSV Considerations
• Provide enough memory (heap & page-cache)
• Make sure your data is clean
• Create indexes and constraints upfront
• Use Labels for Matching
• DISTINCT, SKIP, LIMIT to control data volume
• Test with small batch
• Use PERIODIC COMMIT for larger volumes (> 20k)
• Beware of the EAGER Operation
• Will pull in all your CSV data
• Use EXPLAIN to detect it
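For the "test with small batch" advice, one option besides `LIMIT` inside Cypher is to cut a small sample file first and point `LOAD CSV` at it. The sketch below copies the header plus the first few data rows. The helper name `write_sample` and the sample rows are illustrative assumptions.

```python
import csv
import io
from itertools import islice


def write_sample(source, target, limit):
    """Copy the header plus the first `limit` data rows; return rows copied."""
    reader = csv.reader(source)
    writer = csv.writer(target, lineterminator="\n")
    header = next(reader, None)
    if header is None:  # empty input: nothing to copy
        return 0
    writer.writerow(header)
    count = 0
    for row in islice(reader, limit):
        writer.writerow(row)
        count += 1
    return count


# In-memory stand-ins for the real input and output files
source = io.StringIO("City,Zip\nGent,9000\nDestelbergen,9070\nGent,9000\n")
target = io.StringIO()
copied = write_sample(source, target, limit=2)
print(copied)
print(target.getvalue())
```

Going through the `csv` module rather than counting raw lines keeps quoted fields that contain newlines intact in the sample.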
Simplest LOAD CSV Example | Guide Import CSV | RDBMS ETL Guide
Demo
Mass Data Bulk Importer
neo4j-import --into graph.db
Neo4j Bulk Import Tool
• Memory efficient and scalable Bulk-Inserter
• Proven to work well for billions of records
• Easy to use, no memory configuration needed
Reference Manual: Import Tool
Chicago Crime Dataset
• City of Chicago, Crime Data since 2001
• Go to Website, download dataset
• Prepare Dataset, Cleanup
• Specify Headers (direct or separate file)
• ID-definition, data-types, labels, rel-types
• Import (30-50s)
• Use!
https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2
http://markhneedham.com/blog?s=Chicago+Crime
Chicago Crime Dataset
• crimeTypes.csv
• Types of crimes
• beats.csv
• Police areas
• crimes.csv
• Crime description
• crimesBeats.csv
• In which beat did a crime happen
• crimesPrimaryTypes.csv
• Primary Type assignment
Chicago Crime Dataset
crimes.csv
:ID(Crime),id,:LABEL,date,description
8920441,8920441,Crime,12/07/2012 07:50:00 AM,AUTOMOBILE
primaryTypes.csv
:ID(PrimaryType),crimeType
ARSON,ARSON
crimesPrimaryTypes.csv
:START_ID(Crime),:END_ID(PrimaryType)
5221115,NARCOTICS
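Files in this header format are usually generated from the raw download by a small script. The sketch below writes a node file and a relationship file with the same headers as shown above. The input rows, their dictionary keys and the helper name are invented for illustration and are not taken from the real dataset.

```python
import csv
import io


def write_import_files(crimes, nodes_out, rels_out):
    """Write a node file and a relationship file with neo4j-import headers."""
    nodes = csv.writer(nodes_out, lineterminator="\n")
    rels = csv.writer(rels_out, lineterminator="\n")
    nodes.writerow([":ID(Crime)", "id", ":LABEL", "date", "description"])
    rels.writerow([":START_ID(Crime)", ":END_ID(PrimaryType)"])
    for crime in crimes:
        # The id is written twice: once as import id, once as a stored property
        nodes.writerow(
            [crime["id"], crime["id"], "Crime", crime["date"], crime["description"]]
        )
        rels.writerow([crime["id"], crime["primary_type"]])


# Invented sample row (not real crime data)
crimes = [
    {
        "id": "1",
        "date": "12/07/2012 07:50:00 AM",
        "description": "AUTOMOBILE",
        "primary_type": "ARSON",
    }
]
nodes_out, rels_out = io.StringIO(), io.StringIO()
write_import_files(crimes, nodes_out, rels_out)
print(nodes_out.getvalue())
print(rels_out.getvalue())
```

The ID groups in parentheses (`Crime`, `PrimaryType`) let each file use its own id space while the relationship file still refers to both unambiguously.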
Chicago Crime Dataset
./neo/bin/neo4j-import \
  --into crimes.db \
  --nodes:CrimeType primaryTypes.csv \
  --nodes beats.csv \
  --nodes crimes_header.csv,crimes.csv \
  --relationships:CRIME_TYPE crimesPrimaryTypes.csv \
  --relationships crimesBeats.csv
Demo
Neo4j-RDBMS-Importer
Proof of Concept
Recap –
Transformation Rules
Normalized ER-Models: Transformation Rules
• Tables become nodes
• Table name as node-label
• Columns turn into properties
• Convert values if needed
• Foreign Keys (1:1, 1:n, n:1) into relationships,
column name into relationship-type (or better verb)
• JOIN-Tables represent relationships
• Also other tables without domain identity (w/o PK) and two FKs
• Columns turn into relationship properties
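The rules above can be expressed as a small classification function over table metadata. This is a sketch of one possible reading of the rules, not the logic of the neo4j-rdbms-import tool: a table with exactly two foreign keys and no primary key beyond those columns is treated as a relationship, everything else as a node. The `Table` class and the example tables are assumptions, loosely modeled on an employee schema.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    columns: list
    primary_key: list = field(default_factory=list)
    foreign_keys: dict = field(default_factory=dict)  # column -> referenced table


def classify(table):
    """Map a table to a node or relationship description."""
    fks = table.foreign_keys
    properties = [c for c in table.columns if c not in fks]
    # JOIN table: two FKs and no identity of its own (no PK, or PK made of FKs)
    if len(fks) == 2 and set(table.primary_key) <= set(fks):
        start, end = fks.values()
        return {
            "kind": "relationship",
            "type": table.name.upper(),
            "start": start,
            "end": end,
            "properties": properties,
        }
    # Regular table: node, with one relationship per foreign key column
    relationships = [
        {"type": column.upper(), "start": table.name, "end": target}
        for column, target in fks.items()
    ]
    return {
        "kind": "node",
        "label": table.name,
        "properties": properties,
        "relationships": relationships,
    }


# Hypothetical example tables
employees = Table("employees", ["emp_no", "first_name"], primary_key=["emp_no"])
dept_emp = Table(
    "dept_emp",
    ["emp_no", "dept_no", "from_date", "to_date"],
    primary_key=["emp_no", "dept_no"],
    foreign_keys={"emp_no": "employees", "dept_no": "departments"},
)
print(classify(employees)["kind"])  # node
print(classify(dept_emp)["kind"])   # relationship
```

The relationship type here is simply the upper-cased table name; in practice a verb such as `WORKS_IN` usually reads better, as the rules suggest.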
Normalized ER-Models: Cleanup Rules
• Remove technical IDs (auto-incrementing PKs)
• Keep domain IDs (e.g. ISBN)
• Add constraints for those
• Add indexes for lookup fields
• Adjust names for Label, REL_TYPE and propertyName
Note: currently no composite constraints and indexes
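The name adjustment step can be automated. The sketch below assumes the common Neo4j naming style hinted at by the spelling on the slide: CamelCase for labels, UPPER_SNAKE_CASE for relationship types and camelCase for property names. The function names are illustrative, and empty names are not handled.

```python
import re


def words(name):
    """Split snake_case, kebab-case or camelCase names into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [w.lower() for w in re.split(r"[^A-Za-z0-9]+", spaced) if w]


def to_label(name):
    """dept_emp -> DeptEmp"""
    return "".join(w.capitalize() for w in words(name))


def to_rel_type(name):
    """worksFor -> WORKS_FOR"""
    return "_".join(w.upper() for w in words(name))


def to_property_name(name):
    """first_name -> firstName"""
    first, *rest = words(name)
    return first + "".join(w.capitalize() for w in rest)


print(to_label("dept_emp"))            # DeptEmp
print(to_rel_type("worksFor"))         # WORKS_FOR
print(to_property_name("first_name"))  # firstName
```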
RDBMS Import Tool Demo – Proof of Concept
• JDBC for vendor-independent database connection
• SchemaCrawler to extract DB-Meta-Data
• Use Rules to drive graph model import
• Optional means to override default behavior
• Scales writes with Parallel Batch Importer API
• Reads tables concurrently for nodes & relationships
Demo: MySQL - Employee Demo Database
Source: github.com/jexp/neo4j-rdbms-import
Blog Post
[Diagram: Postgres, MySQL, Oracle]
Demo
Architecture & Integration
“Polyglot Persistence”
[Diagram: three options for placing data between a relational database, a graph database and the application: migrate all data, migrate graph data, duplicate graph data]
Three Ways to Migrate Data to Neo4j
Neo4j Fits into Your Enterprise Environment
[Diagram labels: Application; Graph Database Cluster (Neo4j, Neo4j, Neo4j); Data Storage and Business Rules Execution; Data Mining and Aggregation; Ad Hoc Analysis; Bulk Analytic Infrastructure; Graph Compute Engine; EDW; Data Scientist; End User; Databases: Relational, NoSQL, Hadoop]
Next Steps
Community. Training. Support.
There Are Lots of Ways to Easily Learn Neo4j
Resources
Online
• Developer Site
neo4j.com/developer
• RDBMS to Graph
• Guide: ETL from RDBMS
• Guide: CSV Import
• LOAD CSV Webinar
• Reference Manual
• StackOverflow
Offline
• In-Browser Guide "Northwind"
• Import Training Classes
• Office Hours
• Professional Services Workshop
• Free Books:
• Graph Databases 2nd Edition
• Learning Neo4j
Register today at graphconnect.com
Early Bird only $99
Relational to Graph
Data Import
Thank you!
Questions?
neo4j.com | @neo4j

Relational to Graph - Import

  • 1. Relational to Graph Importing Data into Neo4j June 2015 Michael Hunger michael@neo4j.org |@mesirii
  • 2. Agenda • Review Webinar Series • Importing Data into Neo4j • Getting Data from RDBMS • Concrete Examples • Demo • Q&A
  • 4. Webinar Review – Relational to Graph • Introduction and Overview • Introduction of Neo4j, Solving RDBMS Issues, Northwind Demo • Modeling Concerns • Modeling in Graphs and RDBMS, Good Modeling Practices, • Model first, incremental Modeling, Model Transformation (Rules) • Import • Importing into Neo4j, Getting Data from RDBMS, Concrete Examples • NEXT: Querying • SQL to Cypher, Comparison, Example Queries, Hard in SQL -> Easy and Fast in Cypher
  • 5. Why are we doing this? The Graph Advantage
  • 6. Relational DBs Can’t Handle Relationships Well • Cannot model or store data and relationships without complexity • Performance degrades with number and levels of relationships, and database size • Query complexity grows with need for JOINs • Adding new types of data and relationships requires schema redesign, increasing time to market … making traditional databases inappropriate when data relationships are valuable in real-time Slow development Poor performance Low scalability Hard to maintain
  • 7. Unlocking Value from Your Data Relationships • Model your data naturally as a graph of data and relationships • Drive graph model from domain and use-cases • Use relationship information in real-time to transform your business • Add new relationships on the fly to adapt to your changing requirements
  • 8. High Query Performance with a Native Graph DB • Relationships are first-class citizens • No need for joins, just follow pre-materialized relationships of nodes • Query & Data-locality – navigate out from your starting points • Only load what’s needed • Aggregate and project results as you go • Optimized disk and memory model for graphs
  • 10. Getting Data into Neo4j: CSV Cypher-Based “LOAD CSV” Capability • Transactional (ACID) writes • Initial and incremental loads of up to 10 million nodes and relationships • From HTTP and Files • Power of Cypher • Create and Update Graph Structures • Data conversion, filtering, aggregation • Destructuring of Input Data • Transaction Size Control • Also via Neo4j-Shell [Diagram: CSV, 10 M]
  • 11. Getting Data into Neo4j: CSV Command-Line Bulk Loader neo4j-import • For initial database population • Scale across CPUs and disk performance • Efficient RAM usage • Split- and compressed file support • For loads up to 10B+ records • Up to 1M records per second [Diagram: CSV, 100 B]
  • 12. Getting Data into Neo4j: APIs Custom Cypher-Based Loader • Uses transactional Cypher HTTP endpoint • Parameterized, batched, concurrent Cypher statements • Any programming/script language with driver or plain HTTP requests • Also for JSON and other formats • Also available as JDBC Driver [Diagram: Any Data, Programs, 10 M]
  • 13. Getting Data into Neo4j: APIs JVM Transactional Loader • Use Neo4j’s Java-API • From any JVM language, concurrent • Fine grained TX Management • Create Nodes and Relationships directly • Also possible as Server extension • Arbitrary data loading [Diagram: Any Data, Programs, 1 B]
  • 14. Getting Data into Neo4j: API Bulk Loader API • Used by neo4j-import tool • Create Streams of node and relationship data • Id-groups, id-handling & generation, conversions • Highly concurrent and memory efficient • High performance CSV Parser, Decorators [Diagram: CSV, 100 B]
  • 15. Import Performance: Some Numbers • Cypher Import 10k-10M records • Import 100K-100M records per second transactionally • Bulk import tens of billions of records in a few hours
  • 16. Import Performance: Hardware Requirements • Fast disk: SSD or SSD RAID • Many Cores • Medium amount of RAM (8-64G) • Local Data Files, compress to save space • High performance concurrent connection to relational DB • Linux and OS X work better than Windows (FS-Handling) • Disable Virus Scanners, Check Disk Scheduler
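
The "parameterized, batched" idea behind the custom Cypher-based loader (slide 12) can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the statement, the `rows` parameter name, and the sample cities are assumptions, and the JSON body follows the `statements` / `statement` / `parameters` shape that the transactional HTTP endpoint accepted at the time. Nothing is sent over the network here; the sketch only builds the request bodies.

```python
import json
from itertools import islice


def batches(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_payload(statement, rows):
    """Wrap one batch of rows in a JSON request body with a single statement."""
    return json.dumps({
        "statements": [
            {"statement": statement, "parameters": {"rows": rows}}
        ]
    })


# Hypothetical statement: one parameterized query reused for every batch.
# `{rows}` is the parameter syntax of 2015-era Neo4j; newer versions use `$rows`.
STATEMENT = "UNWIND {rows} AS row MERGE (c:City {name: row.city})"

# Illustrative input data, not taken from the deck.
cities = [{"city": name} for name in ["GENT", "DESTELBERGEN", "BRUGGE"]]
payloads = [build_payload(STATEMENT, batch) for batch in batches(cities, 2)]
```

Reusing one statement text and varying only the parameters lets the server cache the query plan, and the batch size plays the role that transaction size control plays in LOAD CSV.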
  • 18. Accessing Relational Data • Dump to CSV: all relational databases have the option to dump query results and tables to CSV • Access with DB-Driver: access the DB with JDBC/ODBC or another driver to pull out selected datasets • Use built-in or external endpoints: some databases expose HTTP-APIs or can be integrated (DataClips) • Use ETL-Tools: existing ETL tools can read from relational databases and write to Neo4j, e.g. via JDBC
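
The "access with DB driver, dump to CSV" route might look like the following sketch. An in-memory SQLite database stands in for the relational source, and the `city` table, its columns, and the helper name `dump_query_to_csv` are all illustrative assumptions; with another database only the connection line would change.

```python
import csv
import io
import sqlite3


def dump_query_to_csv(conn, query, out):
    """Run `query` and write the result, header row first, as CSV to `out`."""
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per result column; item 0 is its name.
    writer.writerow([col[0] for col in cursor.description])
    row_count = 0
    for row in cursor:
        writer.writerow(row)
        row_count += 1
    return row_count


# In-memory SQLite stands in for the relational source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO city (name, zip) VALUES (?, ?)",
    [("Gent", "9000"), ("Destelbergen", "9070")],
)

buffer = io.StringIO()
written = dump_query_to_csv(conn, "SELECT name, zip FROM city ORDER BY name", buffer)
csv_text = buffer.getvalue()
```

Writing the header row from the cursor metadata keeps the file usable with `LOAD CSV WITH HEADERS` without a separate header definition.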
  • 20. Import Demo • Cypher-Based “LOAD CSV” Capability: used to import address data • Command-Line Bulk Loader neo4j-import: Chicago Crime Dataset • Relational Import Tool neo4j-rdbms-import: Proof of Concept, JDBC + API
  • 22. Data Quality – Beware of Real World Data! • Messy! Don't trust the data • Byte Order Mark • Binary zeros, non-text characters • Inconsistent line breaks • Header inconsistent with data • Special characters in non-quoted text • Unexpected newlines in quoted and unquoted text fields • Stray quotes
  • 23. CSV – Power-Horse of Data Exchange • Most Databases, ETL and Office-Tools can read and write CSV • Format only loosely specified • Problems with quotes, newlines, charsets • Some good checking tools (CSVKit)
  • 24. Address Dataset • Exported as large JOIN between • City • Zip • Street • Number • Enterprise • address.csv
      EntityNumber  TypeOfAddress  Zipcode  MunicipalityNL  StreetNL             StreetFR             HouseNr
      200.065.765   REGO           9070     Destelbergen    Dendermondesteenweg  Dendermondesteenweg  430
      200.068.636   REGO           9000     Gent            Stropstraat          Stropstraat          1
  • 25. LOAD CSV
      // create constraints
      CREATE CONSTRAINT ON (c:City) ASSERT c.name IS UNIQUE;
      CREATE CONSTRAINT ON (z:Zip) ASSERT z.name IS UNIQUE;
      // manage tx
      USING PERIODIC COMMIT 50000
      // load csv row by row
      LOAD CSV WITH HEADERS FROM "file:address.csv" AS csv
      // transform values
      WITH DISTINCT toUpper(csv.City) AS city, toUpper(csv.Zip) AS zip
      // create nodes
      MERGE (:City {name: city})
      MERGE (:Zip {name: zip});
  • 26. LOAD CSV
      // manage tx
      USING PERIODIC COMMIT 100000
      // load csv row by row
      LOAD CSV WITH HEADERS FROM "file:address.csv" AS csv
      // transform values
      WITH DISTINCT toUpper(csv.City) AS city, toUpper(csv.Zip) AS zip
      // find nodes
      MATCH (c:City {name: city}), (z:Zip {name: zip})
      // create relationships
      MERGE (c)-[:HAS_ZIP_CODE]->(z);
  • 27. LOAD CSV Considerations • Provide enough memory (heap & page-cache) • Make sure your data is clean • Create indexes and constraints upfront • Use Labels for Matching • DISTINCT, SKIP, LIMIT to control data volume • Test with small batch • Use PERIODIC COMMIT for larger volumes (> 20k) • Beware of the EAGER Operation • Will pull in all your CSV data • Use EXPLAIN to detect it Simplest LOAD CSV Example | Guide Import CSV | RDBMS ETL Guide
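
"Make sure your data is clean" can be partly automated before LOAD CSV ever sees the file. The sketch below, written under the assumption of UTF-8 input, handles three of the problems from the data-quality slide: a byte order mark, binary zeros, and inconsistent line breaks. It then separates rows whose field count does not match the header. The function names and the sample bytes are made up for illustration.

```python
import csv
import io


def clean_csv_text(raw_bytes, encoding="utf-8"):
    """Decode CSV bytes, dropping a byte order mark, NUL bytes and mixed line breaks."""
    text = raw_bytes.decode(encoding)
    if text.startswith("\ufeff"):
        text = text[1:]
    text = text.replace("\x00", "")
    # Note: this also normalizes line breaks inside quoted fields.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def rows_matching_header(text):
    """Split parsed rows into those matching the header width and those that do not."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    good, bad = [], []
    # row_no counts parsed rows (header = 1), not physical lines.
    for row_no, row in enumerate(reader, start=2):
        (good if len(row) == len(header) else bad).append((row_no, row))
    return header, good, bad


# Illustrative dirty input: BOM, CRLF, a bare CR, a NUL byte, and a short row.
raw = b"\xef\xbb\xbfCity,Zip\r\nGent,9000\rDestel\x00bergen,9070\nBrugge\n"
header, good, bad = rows_matching_header(clean_csv_text(raw))
```

Rejected rows are returned rather than dropped silently, so they can be inspected before deciding whether to repair or skip them. Stray quotes and wrong character sets are not covered here; a dedicated checker such as CSVKit is the better tool for those.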
  • 29. Mass Data Bulk Importer neo4j-import --into graph.db
  • 30. Neo4j Bulk Import Tool • Memory efficient and scalable Bulk-Inserter • Proven to work well for billions of records • Easy to use, no memory configuration needed CSV Reference Manual: Import Tool
  • 31. Chicago Crime Dataset • City of Chicago, Crime Data since 2001 • Go to Website, download dataset • Prepare Dataset, Cleanup • Specify Headers (direct or separate file) • ID-definition, data-types, labels, rel-types • Import (30-50s) • Use! https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2 http://markhneedham.com/blog?s=Chicago+Crime
  • 32. Chicago Crime Dataset • crimeTypes.csv • Types of crimes • beats.csv • Police areas • crimes.csv • Crime description • crimesBeats.csv • In which beat did a crime happen • crimesPrimaryTypes.csv • Primary Type assignment
  • 33. Chicago Crime Dataset
      crimes.csv
        :ID(Crime),id,:LABEL,date,description
        8920441,8920441,Crime,12/07/2012 07:50:00 AM,AUTOMOBILE
      primaryTypes.csv
        :ID(PrimaryType),crimeType
        ARSON,ARSON
      crimesPrimaryTypes.csv
        :START_ID(Crime),:END_ID(PrimaryType)
        5221115,NARCOTICS
  • 34. Chicago Crime Dataset
      ./neo/bin/neo4j-import --into crimes.db \
        --nodes:CrimeType primaryTypes.csv \
        --nodes beats.csv \
        --nodes crimes_header.csv,crimes.csv \
        --relationships:CRIME_TYPE crimesPrimaryTypes.csv \
        --relationships crimesBeats.csv
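
The "prepare dataset" step for the bulk importer amounts to splitting one flat export into node files and relationship files with the header rows shown on slide 33. A possible sketch follows; the header rows are copied from the slide, while the input field names (`id`, `date`, `description`, `primary_type`) and the three sample records are invented for illustration and are not the real column names of the Chicago download.

```python
import csv
import io


def split_for_import(raw_rows):
    """Turn flat crime records into node and relationship CSV text for a bulk import."""
    crimes, types, rels = io.StringIO(), io.StringIO(), io.StringIO()
    crime_w = csv.writer(crimes, lineterminator="\n")
    type_w = csv.writer(types, lineterminator="\n")
    rel_w = csv.writer(rels, lineterminator="\n")

    # Header rows as shown on the slide: ID groups, label column, start/end IDs.
    crime_w.writerow([":ID(Crime)", "id", ":LABEL", "date", "description"])
    type_w.writerow([":ID(PrimaryType)", "crimeType"])
    rel_w.writerow([":START_ID(Crime)", ":END_ID(PrimaryType)"])

    seen_types = set()
    for row in raw_rows:
        crime_w.writerow([row["id"], row["id"], "Crime", row["date"], row["description"]])
        if row["primary_type"] not in seen_types:  # one node per distinct type
            seen_types.add(row["primary_type"])
            type_w.writerow([row["primary_type"], row["primary_type"]])
        rel_w.writerow([row["id"], row["primary_type"]])
    return crimes.getvalue(), types.getvalue(), rels.getvalue()


# Made-up sample records (assumption), only to show the shape of the output.
raw = [
    {"id": "1", "date": "01/01/2015 08:00:00 AM", "description": "SAMPLE A", "primary_type": "ARSON"},
    {"id": "2", "date": "01/02/2015 09:30:00 PM", "description": "SAMPLE B", "primary_type": "NARCOTICS"},
    {"id": "3", "date": "01/03/2015 11:15:00 AM", "description": "SAMPLE C", "primary_type": "ARSON"},
]
crimes_csv, types_csv, rels_csv = split_for_import(raw)
```

Deduplicating the type nodes while streaming keeps memory use proportional to the number of distinct types rather than the number of crimes, which matters once the input reaches millions of rows.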
  • 38. Normalized ER-Models: Transformation Rules • Tables become nodes • Table name as node-label • Columns turn into properties • Convert values if needed • Foreign Keys (1:1, 1:n, n:1) into relationships, column name into relationship-type (or better verb) • JOIN-Tables represent relationships • Also other tables without domain identity (w/o PK) and two FKs • Columns turn into relationship properties
  • 39. Normalized ER-Models: Cleanup Rules • Remove technical IDs (auto-incrementing PKs) • Keep domain IDs (e.g. ISBN) • Add constraints for those • Add indexes for lookup fields • Adjust names for Label, REL_TYPE and propertyName Note: currently no composite constraints and indexes
  • 40. RDBMS Import Tool Demo – Proof of Concept • JDBC for vendor-independent database connection • SchemaCrawler to extract DB-Meta-Data • Use Rules to drive graph model import • Optional means to override default behavior • Scales writes with Parallel Batch Importer API • Reads tables concurrently for nodes & relationships Demo: MySQL - Employee Demo Database Source: github.com/jexp/neo4j-rdbms-import Blog Post [Diagram: Postgres, MySQL, Oracle]
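
The transformation rules from slide 38 are mechanical enough to express as code. The sketch below is one possible reading of them, not the logic of neo4j-rdbms-import: tables become node labels, foreign keys become relationships named after the column, and a table with exactly two foreign keys and no primary key outside those columns is treated as a relationship carrying its remaining columns as properties. The table metadata format and the example schema are assumptions.

```python
def to_graph_model(tables):
    """Map table metadata to node labels and (start, type, end, properties) relationships."""
    nodes, rels = {}, []
    for t in tables:
        fks = t.get("foreign_keys", {})  # column name -> referenced table
        pk = t.get("primary_key", [])
        props = [c for c in t["columns"] if c not in fks]
        # Simplified join-table test: two FKs and no identity beyond them.
        is_join_table = len(fks) == 2 and set(pk) <= set(fks)
        if is_join_table:
            (_, start), (_, end) = fks.items()
            rels.append((start, t["name"].upper(), end, props))
        else:
            nodes[t["name"]] = props
            for column, target in fks.items():
                rels.append((t["name"], column.upper(), target, []))
    return nodes, rels


# Hypothetical schema used only to exercise the rules.
schema = [
    {"name": "City", "columns": ["id", "name"], "primary_key": ["id"]},
    {"name": "Person", "columns": ["id", "name", "city_id"], "primary_key": ["id"],
     "foreign_keys": {"city_id": "City"}},
    {"name": "Company", "columns": ["id", "name"], "primary_key": ["id"]},
    {"name": "works_at", "columns": ["person_id", "company_id", "since"],
     "foreign_keys": {"person_id": "Person", "company_id": "Company"}},
]
nodes, rels = to_graph_model(schema)
```

The output still contains the technical `id` columns and a relationship type derived from a column name (`CITY_ID`); the cleanup rules from slide 39 (drop technical IDs, rename to a verb such as `LIVES_IN`) would be applied as a second pass.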
  • 43. Three Ways to Migrate Data to Neo4j • Migrate all data • Migrate graph data • Duplicate graph data [Diagram: Application, Relational Database, Graph Database; non-graph data, graph data, all data]
  • 44. Neo4j Fits into Your Enterprise Environment [Diagram: Application and End User; Graph Database Cluster (Neo4j, Neo4j, Neo4j) for Data Storage and Business Rules Execution; Bulk Analytic Infrastructure (Graph Compute Engine, EDW, …) for Data Mining and Aggregation; Ad Hoc Analysis by Data Scientist; Databases: Relational, NoSQL, Hadoop]
  • 46. There Are Lots of Ways to Easily Learn Neo4j
  • 47. Resources Online • Developer Site neo4j.com/developer • RDBMS to Graph • Guide: ETL from RDBMS • Guide: CSV Import • LOAD CSV Webinar • Reference Manual • StackOverflow Offline • In Browser Guide „Northwind“ • Import Training Classes • Office Hours • Professional Services Workshop • Free Books: • Graph Databases 2nd Edition • Learning Neo4j
  • 48. Register today at graphconnect.com Early Bird only $99
  • 49. Relational to Graph Data Import Thank you ! Questions ? neo4j.com | @neo4j

Editor's Notes

  1. Presenter Notes - Challenges with current technologies? Database options are not suited to model or store data as a network of relationships. Performance degrades with the number and levels of relationships, making them harder to use for real-time applications. They are not flexible enough to add or change relationships in real time.
  2. Presenter Notes - How does one take advantage of data relationships for real-time applications? To take advantage of relationships, data needs to be available as a network of connections (or as a graph). Real-time access to relationship information should be available regardless of the size of the data set or the number and complexity of relationships. The graph should be able to accommodate new relationships or modify existing ones.
  3. In the near future, many of your apps will be driven by data relationships and not transactions. You can unlock value from business relationships with Neo4j.