GraphTalk Copenhagen - Introduction to Graphs and Neo4j
1. Welcome to
Fredrik Johansson, fredrik.johansson@neo4j.com
Dinuke Abeysekera, dinuke.abeysekera@neo4j.com
Stefan Wendin, stefan.wendin@neo4j.com
2. Agenda
10:00 - 12:00 - Presentations
• Introduction to the Neo4j Graph Platform
Fredrik Johansson & Dinuke Abeysekera, Neo4j
• Killing Data Silos in the Life Sciences with Neo4j
Dave Iberson-Hurst, S-cubed
• Fraud Detection with Graphs
Marius Hartmann, Danish Business Authority
• Accelerate Innovation through Graph Thinking
Stefan Wendin, Neo4j
12:00 - Q&A & Networking
3. Neo4j - The Graph Company
500+
7/10
12/25
8/10
53K+
100+
250+
450+
Adoption
Top Retail Firms
Top Financial Firms
Top Software Vendors
Customers Partners
• Creator of the Neo4j Graph Platform
• ~300 employees
• HQ in Silicon Valley, other offices include London, Munich, Paris and Malmö (Sweden)
• $80M in funding from Fidelity, Sunstone, Conor, Creandum, and Greenbridge Capital
• 10M+ downloads
• 300+ enterprise subscription customers, over half of them with >$1B in revenue
Ecosystem
Startups in program
Enterprise customers
Partners
Meet up members
Events per year
Industry’s Largest Dedicated Investment in Graphs
6. What Is Different In Neo4j?
TRADITIONAL DATABASES
• Store and retrieve data
• Real time storage & retrieval
• Max # of hops: up to 3
7. What Is Different In Neo4j?
TRADITIONAL DATABASES
• Store and retrieve data
• Real time storage & retrieval
• Max # of hops: up to 3
BIG DATA TECHNOLOGY
• Aggregate and filter data
• Long running queries
• Aggregation & filtering
• Max # of hops: 1
8. What Is Different In Neo4j?
TRADITIONAL DATABASES
• Store and retrieve data
• Real time storage & retrieval
• Max # of hops: up to 3
BIG DATA TECHNOLOGY
• Aggregate and filter data
• Long running queries
• Aggregation & filtering
• Max # of hops: 1
NEO4J
• Connections in data
• Real-Time Connected Insights
• Max # of hops: millions
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code”
Volker Pacher, Senior Developer
10. What’s Different in Neo4j: “Minutes to Milliseconds” Real-Time Query Performance
[Chart: response time (y-axis) against connectedness and size of data set (x-axis)]
Relational and other NoSQL databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections
1000x advantage: “Minutes to milliseconds”
11. What Is Different In Neo4j?
ACID Graph Writes: A Requirement for Graph Transactions
ACID Consistency
• Maintains Integrity Over Time
• Guaranteed Graph Consistency
Non ‘Graph-ACID’ DBMSs
• Becomes Corrupt Over Time
• Not ‘Good Enough’ for Graphs
12. What Is Different In Neo4j?
Cypher Query Language
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
       count(report) AS Total
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
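To make the Cypher query on this slide concrete, here is a minimal Python sketch of what it computes. It is not Neo4j code and does not show how the database executes the query. The sample hierarchy, the `MANAGES` dictionary and both function names are hypothetical, and the sketch assumes the management hierarchy has no cycles.

```python
from collections import Counter

# Hypothetical sample hierarchy: manager name -> names of direct reports.
MANAGES = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
}


def within(graph, start, min_hops, max_hops):
    """Yield the end node of every MANAGES path of min_hops..max_hops hops."""
    frontier = [start]
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            yield from frontier
        # Expand one more level of direct reports.
        frontier = [child for node in frontier for child in graph.get(node, [])]


def subordinate_totals(graph, boss):
    """Mimic the query: per subordinate (0..3 hops), count reports 1..3 hops below."""
    totals = Counter()
    for sub in within(graph, boss, 0, 3):
        for _report in within(graph, sub, 1, 3):
            totals[sub] += 1
    # Like MATCH, subordinates with no reports produce no row at all.
    return dict(totals)


totals = subordinate_totals(MANAGES, "John Doe")
print(totals)
```

The nested loops mirror the two patterns in the MATCH clause. The Cypher version expresses the same traversal in five lines and leaves the execution strategy to the database.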
13. Neo4j Graph Advantage: Foundational Components
1. Index-Free Adjacency: In memory and on flash/disk
2. ACID Foundation: Required for safe writes
3. Full-Stack Clustering: Causal consistency
4. Language, Drivers, Tooling: Developer Experience, Graph Efficiency, Type Safety
5. Graph Engine: Cost-Based Optimizer, Graph Statistics, Cypher Runtime
6. Hardware Optimizations: For next-gen infrastructure
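The slide names Index-Free Adjacency without spelling it out. As a rough illustration of the general idea only (each node holds direct references to its neighbors, so a hop does not need a lookup in a global index), here is a small Python sketch. The `Node` class, `names_within` and the sample data are hypothetical and say nothing about Neo4j's actual storage format.

```python
class Node:
    """A graph node that holds direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references to adjacent Node objects

    def connect(self, other):
        self.outgoing.append(other)


def names_within(start, hops):
    """Names reachable in at most `hops` steps, found by following references."""
    seen = {start.name}
    frontier = [start]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            # Each hop follows a stored reference; no global index is consulted.
            for neighbor in node.outgoing:
                if neighbor.name not in seen:
                    seen.add(neighbor.name)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


# Hypothetical sample data: a -> b -> c -> d
a, b, c, d = (Node(n) for n in "abcd")
a.connect(b)
b.connect(c)
c.connect(d)
reachable = names_within(a, 2)
print(sorted(reachable))
```

In this sketch the cost of each hop depends on the number of neighbors of the nodes visited, not on the total size of the graph.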
14. Neo4j Enterprise Maturity & Robustness
• Neo4j Security Foundation
• Multi-Clustering Support for Global Internet Apps
• Rolling Upgrades
• Schema Constraints
• Concurrent/Transactional Write Performance
• Auto Cache Reheating for Restarts, Restores and Cluster Expansion
Neo4j 3.4 now supports rolling upgrades (3.4 → 3.5)
Upgrade older instances while keeping other members stable and without requiring a restart of the environment
15. Neo4j: Enabling the Connected Enterprise
Consumers of Connected Data: Data Scientists, Business Users, Applications
AI & Graph Analytics
• Sentiment analysis
• Customer segmentation
• Machine learning
• Cognitive computing
• Community detection
Transactional Graphs
• Fraud detection
• Real-time recommendations
• Network and IT operations management
• Knowledge Graphs
• Master Data Management
Discovery & Visualization
• Fraud detection
• Network and IT operations
• Product information management
• Risk and portfolio analysis
16. Neo4j Graph Platform
[Diagram: platform components and the people and systems that use them]
Components: Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI, openCypher, Cloud
Users: Business Users, Developers, Admins, Data Analysts, Data Scientists, Applications
17. Neo4j Graph Platform: Where We Are Today
Platform components: Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI
Improved Admin Experience
- Rolling upgrades
- Brute force attack prevention
- Fast, resumable backups
- Cache Warming on startup
- Improved diagnostics
Multi-Cluster routing built into Bolt drivers
Seabolt & Go Driver
- Other v1.7 Supported Drivers: Java, JavaScript, Python, .NET
- Community Drivers: Perl, PHP, Ruby, Erlang, R, Haskell, Clojure, JDBC and many others
SparkCypher/Morpheus (pre-EAP)
- Spark Implementation Proposal for getting Cypher into Spark
Neo4j Bloom
- New graph illustration and communication tool for non-technical users
- Explore and edit graph
- Search-based
- Create storyboards
- Foundation for graph data discovery
- Integrated with graph platform
Graph Data Science
High speed graph algorithms
Neo4j Database 3.4 & 3.5
- 70% faster Cypher
- Native GraphB+Tree Indexes
(up to 5x faster writes)
- Full-text search
- Index-Backed Optimisation
- 100B+ bulk importer
- Date/Time data type
- 3-D Geospatial search
- Secure, Horizontal Multi-Clustering
- Property Blacklisting
- Causal Cluster with Raft v2 Protocol
- Hostname verification, Intra-cluster discovery encryption
18. The information presented here is Neo4j, Inc. confidential and does not
constitute, and should not be construed as, a promise or commitment by Neo4j
to develop, market or deliver any particular product, feature or function.
Neo4j reserves the right to change its product plans or roadmap at any time,
without obligation to notify any person of such changes.
The timing and content of Neo4j’s future product releases could differ materially
from the expectations discussed herein.
Safe Harbor Roadmap Disclaimer
19. Neo4j 4.0 Milestone Release 2 is Out!
• For standalone, Causal Cluster and desktop installations
• Download MR2 from https://neo4j.com/download-center/#prerelease
• Windows ZIP, Generic tarball, Docker image, Debian and RedHat packages
• Documentation here
• Features:
- New index population algorithm
- Increased index key size for the native index provider
- Transactional ID Management
- Improved space reuse in store files
- Improved Cluster performance
- New Spring Boot Starter
- SDN/RX
- Support for multiple databases
- Reactive drivers with back-pressure and flow control
- Schema-based security model
- Role and user management
- System database
- neo4j:// scheme
20. Graph Visualization Options for Neo4j
Neo4j Bloom
• Provided by Neo4j
• Exclusively optimized for Neo4j graphs
• Deploys easily in Neo4j Desktop and also as a web-based tool
• Focused on graph exploration through a code-free UI
• Near natural language search
• Currently caters to data analysts and graph SMEs
• Currently for individual or small team use
Viz Toolkits
• 3rd party e.g. vis.js, d3.js, Keylines
• Some offer data hooks into Neo4j, others may require custom integration
• Offer robust APIs for flexible control of the viz output
• Cater to developers who will create a custom solution, usually with limited interactivity
• Departmental, enterprise or public use
BI Tools
• 3rd party e.g. Tableau, Qlik
• Not optimized for graph data, may require a special connector
• UI for dashboard and report creation with many kinds of viz, in addition to graph viz
• Cater to business users and data analysts
• Departmental, cross-department or enterprise use
Graph Viz Solutions
• 3rd party e.g. Linkurious, Tom Sawyer
• Have to support multiple graph models and sources
• Feature UI for exploration or APIs for customizing output and embedding/publishing
• Solutions may cater to business users, analysts or developers
• Small team, departmental or cross-department use
Spectrum across the options:
• Little technical expertise to most technically involved
• Exploration focused to publishing/consumption focused
• Smaller deployments to larger deployments
21. Neo4j Bloom Overview
Perspective: Business view of the graph
Departmental views • Hiding PII • Styling
Search: Near-natural Language Search
Full-text search • Graph patterns • Custom Search Phrases
Visualization: GPU Accelerated Visualization
High performance physics & rendering
Exploration: Direct graph interactions
Select, expand, dismiss, find paths
Inspection: Node + Relationship details
Browse from neighbor to neighbor
Editing: Create, Connect, Update
Code-free graph changes
22. Neo4j Graph Algorithm Library
Pathfinding & Search: Finds the optimal path or evaluates route availability and quality
Centrality: Determines the importance of distinct nodes in the network
Community Detection: Evaluates how a group is clustered or partitioned
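As a rough illustration of the three algorithm categories on this slide, the Python sketch below uses the simplest textbook stand-in for each: breadth-first search for pathfinding, degree counting for centrality, and connected components for community detection. These are illustrative stand-ins, not the Neo4j library's implementations or API, and the sample graph and function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected sample graph as an adjacency mapping.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "x": {"y"},
    "y": {"x"},
}


def shortest_path(graph, source, target):
    """Pathfinding: breadth-first search for a fewest-hops path, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def degree_centrality(graph):
    """Centrality: score each node by its number of neighbors."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def connected_components(graph):
    """Community detection, simplest form: group nodes that can reach each other."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


path = shortest_path(GRAPH, "a", "d")
scores = degree_centrality(GRAPH)
groups = connected_components(GRAPH)
print(path, scores["c"], groups)
```

Production libraries typically offer richer variants in each category, such as weighted shortest paths, PageRank-style centrality and modularity-based community detection.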