GraphTalk
Milan
What is the most
powerful database in the
world?
Tables?
What is a Graph
Graph Theory
Meet
Leonhard Euler
• Swiss mathematician
• Inventor of Graph
Theory (1736)
Königsberg (Prussia) - 1736
[Figure: the bridges of Königsberg drawn as a graph, with land masses A, B, C, D as nodes and bridges 1-7 as edges]
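Euler's argument about the Königsberg bridges can be checked in a few lines. The sketch below assumes the classic textbook layout of the seven bridges (the slide's own A-D and 1-7 labelling cannot be recovered from the text, so the pairing of bridges to land masses here is an assumption). It counts how many bridges touch each land mass; a walk that crosses every bridge exactly once exists only if zero or two land masses have an odd count.

```python
from collections import Counter

# Assumed textbook layout: each pair is one bridge between two land masses.
# The mapping of the slide's labels A-D onto this layout is hypothetical.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def has_euler_walk(edges):
    """For a connected multigraph: an Euler walk exists iff 0 or 2 nodes have odd degree."""
    odd = [node for node, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_euler_walk(BRIDGES))
```

With this layout all four land masses have an odd number of bridges, so no such walk exists. The function assumes the graph is connected and does not check it.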
A Graph Is
ROAD
TRAFFIC
LIGHTS
A Graph Is
HAS
HOTEL
ROOMS
AVAILABLE
A Graph Is
KNOWS
KNOWS
WORKS_AT
WORKS_AT
WORKS_AT
COMPANY
STANFORD
STUDIED_AT
NEO
COLUMBIA
STUDIED_AT NAME:ANNE
A Graph
RELATIONSHIPS
NODE
PROPERTY
Labelled Property Graph Model
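A minimal in-memory sketch of the labelled property graph model may make the three building blocks concrete: nodes carry labels and properties, and relationships are typed, directed, and can carry properties too. The class names `Node` and `Relationship`, the labels `Person`, `Company` and `University`, and the choice of who studied where are illustrative assumptions; only the relationship types and the names Anne, Neo and Stanford come from the slide. This is not how Neo4j stores data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                   # e.g. {"Person"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                                   # relationships are directed
    type: str                                     # e.g. "WORKS_AT"
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical data loosely based on the slide's example.
anne = Node({"Person"}, {"name": "Anne"})
neo = Node({"Company"}, {"name": "Neo"})
stanford = Node({"University"}, {"name": "Stanford"})

relationships = [
    Relationship(anne, "WORKS_AT", neo),
    Relationship(anne, "STUDIED_AT", stanford),
]


def outgoing(node, rel_type):
    """Names of the nodes reached from `node` over relationships of `rel_type`."""
    return [
        r.end.properties["name"]
        for r in relationships
        if r.start is node and r.type == rel_type
    ]


print(outgoing(anne, "WORKS_AT"))
```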
A way of representing data
DATA DATA
Relational
Database
Good for:
Well-understood data structures
that don’t change too frequently
A way of representing data
Known problems involving discrete
parts of the data, or minimal
connectivity
Graph
Database
Relational
Database
Good for:
Well-understood data structures
that don’t change too frequently
Known problems involving discrete
parts of the data, or minimal
connectivity
A way of representing data
Good for:
Dynamic systems: where the data
topology is difficult to predict
Dynamic requirements:
that evolve with the business
Problems where the relationships in
data contribute meaning & value
Queries can take non-sequential,
arbitrary paths through data
Real-time queries need speed and
consistent response times
Queries must run reliably
with consistent results
A single query can
touch a lot of data
Relationship Queries Strain Traditional Databases
Neo4j is a highly scalable, native graph database.
Neo4j gives any organization the ability to leverage
connections in data — in real-time
to create value
Our core belief is — connections between data
are as important as the data itself
Use of data connections has created industry leaders
Today, as processes get digitized and
interconnected, we see the emergence of a
“connected enterprise”
PAYMENTS
SALES CHANNELS
SUPPLY CHAIN
PRODUCTS
MARKETING
CRM
SALES CHANNELS
Store
Mobile
Webstore
BEYOND
MOBILE
Augmented Reality
Smart products
Connected homes
CONNECTED
ENTERPRISE
SUPPLY
CHAIN
PRODUCTS
Shipping
Inventory
Express goods
Home delivery
Ratings
Price-range
Category
Credit Card
Cash
Mobile Pay
Purchase History
Returns
Reviews
Tweets
Emails
Customer support
Content
Promotions
Online advertising
Loyalty Programs
Feedback
CONSUMER
DATA
PRODUCT
DATA
PAYMENT
DATA
SOCIAL
DATA
SUPPLIER
DATA
“The next wave of competitive advantage will be
all about using connections to produce
actionable insights.”
Real-Time Recommendations
Dynamic Pricing
Actionable insights power new use cases
Artificial Intelligence
& IoT-applications
Fraud Detection
Network Management
Customer Engagement
Supply Chain
Efficiency
Identity and Access
Management
Neo4j
Advantage
Neo4j Property Graph
The Whiteboard Model is the Physical Model
A unified view for
ultimate agility
• Easily understood
• Easily evolved
• Easy collaboration
between business
and IT
Equivalent Cypher Query
MATCH (you)-[:BOUGHT]->(something)<-[:BOUGHT]-(other)-[:BOUGHT]->(reco)
WHERE id(you)={id}
RETURN reco
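The traversal in this query (from you, to what you bought, to others who bought it, to what else they bought) can be mirrored over plain Python sets. This is a sketch under stated assumptions: the `BOUGHT` data and customer names are made up, and unlike the query as written it also filters out items you already own and counts how many paths lead to each recommendation.

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of products bought.
BOUGHT = {
    "you":   {"book", "lamp"},
    "alice": {"book", "desk"},
    "bob":   {"lamp", "desk", "chair"},
    "carol": {"sofa"},
}


def recommend(you, bought=BOUGHT):
    """Follow you -> something <- other -> reco and count paths per reco."""
    scores = Counter()
    for something in bought[you]:                  # (you)-[:BOUGHT]->(something)
        for other, basket in bought.items():       # (something)<-[:BOUGHT]-(other)
            if other == you or something not in basket:
                continue
            for reco in basket - bought[you]:      # (other)-[:BOUGHT]->(reco)
                scores[reco] += 1                  # extra step: skip items already owned
    return scores


print(recommend("you").most_common())
```

Here "desk" is reached along two paths (via alice and via bob) and "chair" along one, while carol's "sofa" is never reached because she shares no purchase with you.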
Traversal Speeds on Amazon Retail Dataset
Threads    Hops per second
1          3-4 million
10         17-29 million
20         34-50 million
30         36-60 million
Social Recommendation Example
Neo4j Advantage - Performance
Neo4j: Native Graph from the Start
Native graph storage
Optimized for real-time reads and ACID writes
• Relationships stored as physical objects,
eliminating need for joins and join tables
• Nodes connected at write time, enabling
scale-independent response times
Native graph querying
Memory structures and algorithms optimized for graphs
• Index-free adjacency enables 1M+ hops per second via
in-memory pointer chasing
• Off-heap page cache improves operational robustness
and scaling compared with JVM-based caches
• “Minutes to milliseconds” performance improvement
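The idea behind index-free adjacency can be illustrated with a toy comparison: a relational-style hop looks neighbours up in a shared edge table, while an adjacency-style hop follows references stored on the node itself. The sketch below is only an illustration of that contrast; the names `Node`, `connect`, `hop_via_scan` and `hop_via_pointers` are hypothetical, and it says nothing about Neo4j's actual storage format or speed.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.out = []          # direct references to neighbouring Node objects


def connect(a, b):
    """Wire the relationship at write time by storing a reference on the node."""
    a.out.append(b)


def hop_via_scan(edge_table, name):
    """Relational-style hop: search the whole edge table; cost grows with total edges."""
    return [dst for src, dst in edge_table if src == name]


def hop_via_pointers(node):
    """Adjacency-style hop: follow stored references; cost depends only on this node's degree."""
    return [n.name for n in node.out]


# Build the same tiny graph both ways.
edge_table = [("a", "b"), ("a", "c"), ("b", "c")]
nodes = {name: Node(name) for name in "abc"}
for src, dst in edge_table:
    connect(nodes[src], nodes[dst])

print(hop_via_scan(edge_table, "a"))
print(hop_via_pointers(nodes["a"]))
```

Both calls return the same neighbours. The difference is that the first does work proportional to the size of the whole table (or needs an index lookup), while the second touches only the starting node's own relationships.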
Neo4j Advantage - Performance
Neo4j Advantage - ACID Transactions
Neo4j: Built for the Enterprise
Native Graph Storage
Designed, built, and tested for graphs
Native Graph Query Processing
For real-time, relationship-based apps
Evaluate millions of relationships in a blink
Whiteboard-Friendly Data Modeling
Faster projects compared to RDBMS
ACID Transactions and Security
Fully ACID transactions, causal consistency and
enterprise security
Powerful, Expressive Query Language
Improved productivity, with 10x to 100x less
code than SQL
Causal Clustering Architecture
Architecture provides ideal balance of
performance, availability, scale for graphs
Built-in Data Import
Seamless import from other databases
Drivers for Popular Languages and Platforms
Fits easily into your IT environment, with
drivers and APIs for popular languages
Neo4j: Right for the Enterprise
ACID Transactions
• ACID transactions with causal consistency
• Neo4j Security Foundation delivers
enterprise-class security and control
Hardware Efficiency
• Native graph query processing and storage
requires 10x less hardware
• Index-free adjacency requires 10x less CPU
Agility
• Native property graph model
• Modify schema as business changes
without disrupting existing data
Developer Productivity
• Easy to learn, declarative openCypher
graph query language
• Procedural language extensions
• Open library of procedures and functions
(APOC)
• Neo4j support and training
• Worldwide developer community
… all backed by Neo’s track record of
leadership and product roadmap
Performance
• Index-free adjacency delivers millions of
hops per second
• In-memory pointer chasing for fast query
results
Neo4j in
Action
Real-time Package Routing
• Large postal service with over
500k employees
• Neo4j routes 7M+ packages daily
at peak, with bursts of 5,000+
routing operations per second
Real-time promotion recommendations
• Record “Cyber Monday” sales
• About 35M daily transactions
• Each transaction is 3-22 hops
• Queries executed in 4ms or less
• Replaced IBM WebSphere Commerce
Real-time pricing engine
• 300M pricing operations per day
• 10x transaction throughput on half
the hardware compared to Oracle
• Presentation at
http://graphconnect.com/gc2016-sf/
• Replaced Oracle database
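To put the daily volumes quoted above on a common scale, they can be converted to average per-second rates. This is plain arithmetic on the slide's figures, treating "7M+", "about 35M" and "300M" as exact values, which is an assumption; real traffic is bursty, so averages understate peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily figures as quoted on the slide, taken as exact for illustration.
daily_volumes = {
    "package routing": 7_000_000,
    "promotion recommendations": 35_000_000,
    "pricing operations": 300_000_000,
}


def average_per_second(per_day):
    """Average rate if the daily volume were spread evenly over 24 hours."""
    return per_day / SECONDS_PER_DAY


rates = {name: round(average_per_second(v)) for name, v in daily_volumes.items()}
for name, rate in rates.items():
    print(f"{name}: about {rate} per second on average")
```

The routing case averages about 81 operations per second, so the quoted 5,000+ per second is roughly 60 times the daily average.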
Routing Recommendations
Don’t Take Our Word For It
Examples of companies that use Neo4j, the world’s leading graph
database, for recommendation and personalization engines.
Adidas uses Neo4j to combine
content and product data into a
single, searchable graph database
which is used to create a
personalized customer experience
“We have many different silos, many
different data domains, and in order
to make sense out of our data, we
needed to bring those together and
make them useful for us,”
– Sokratis Kartelias, Adidas
eBay Now Tackles eCommerce
Delivery Service Routing with Neo4j
“We needed to rebuild when growth
and new features made our slowest
query longer than our fastest delivery
- 15 minutes! Neo4j gave us best
solution”
– Volker Pacher, eBay
Walmart uses Neo4j to give
customers the best web experience
through relevant and personal
recommendations
“As the current market leader in
graph databases, and with enterprise
features for scalability and
availability, Neo4j is the right choice
to meet our demands.”
– Marcos Wada, Walmart
Product recommendations Personalization
LinkedIn Chitu seeks to engage
Chinese jobseekers through a
game-like user interface that is
available on both desktop and
mobile devices.
“The challenge was speed,” said
Dong Bin, Manager of
Development at Chitu. “Due to the
rate of growth we saw from our
competitors in the Chinese market,
we knew that we had to launch
Chitu as quickly as possible.”
Social Network
Background
• Large global bank
• Deploying Reference Data to users and systems
• 12 data domains, 18 datasets, 400+ integrations
• Complex data management infrastructure
Business Problem
• Master data silos were inflexible and hard to
consume
• Needed simplification to reduce redundancy
• Reduce risk when data is in consumers’ hands
• Dramatically improve efficiency
Solution and Benefits
• Data distribution flows improved dramatically
• Knowledge Base improves consumer access
• Ad-hoc analytics improved
• Governance, lineage and trust improved
• Better service level from IT to data consumers
UBS FINANCIAL SERVICES
Master Data Management / Metadata
CE Customer since 2016
Q1
EE Customer since
2015
Background
• Brazil's largest bank, #38 on Forbes G2000
• $61B annual sales, 95K employees
• Most valuable brand in Brazil
• 28.9M credit card & 25.6M debit card accounts
• High integrity, customer-centric values
Business Problem
• Data silos made assessing credit worthiness hard
• High sensitivity to fraud activity
• 73% of all transactions over internet and mobile
• Needed real-time detection for 2,000 analysts
• Scale to trillions of relationships
Solution and Benefits
• Credit monitoring and fraud detection application
• 4.2M nodes & 4B relationships for 100 analysts
• Grow to 93T relationships for 2000 analysts by 2021
• Real time visibility into money flow across multiple
customers
Itau Unibanco FINANCIAL SERVICES
Fraud Detection / Credit Monitoring
CE Customer since 2016
Q1
EE Customer since Q2
2017
Background
• Mid-size German insurer founded in 1858
• Project executed by Delvin, a subsidiary
of die Bayerische Versicherung and an IT
insurance specialist
Business Problem
• Field sales needed easy, dynamic, 24/7 access
to policies and customer data
• Existing DB2 system unable to meet
performance and scaling demands
• Needed 24/7 available system for sales unit
outside the company
Solution and Benefits
• Enabled flexible searching of policies and
associated personal data
• Raised the bar on industry practices
• Delivered high performance and scalability
• Ported existing metadata easily
die Bayerische INSURANCE
Master Data Management
Business Problem
• Find relationships between people, accounts,
shell companies and offshore accounts
• Journalists are non-technical
• Biggest “Snowden-Style” document leak ever;
11.5 million documents, 2.6TB of data
Solution and Benefits
• Pulitzer Prize winning investigation resulted in
robust coverage of fraud and corruption
• PM of Iceland resigned; investigation exposed Putin,
prime ministers, gangsters and celebrities (Messi)
• Trials ongoing
Background
• International Consortium of Investigative
Journalists (ICIJ), small team of data journalists
• International investigative team specializing in
cross-border crime, corruption and accountability
of power
• Works regularly with leaks and large datasets
ICIJ Panama Papers INVESTIGATIVE JOURNALISM
Fraud Detection / Graph-Based Search
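Finding how people, accounts, shell companies and offshore accounts are linked is at heart a path search. The sketch below runs a breadth-first search over a handful of made-up, undirected links and returns the shortest chain between two entities. All entity names and the functions `build_adjacency` and `shortest_path` are hypothetical; the deck does not describe how the ICIJ's actual queries were written.

```python
from collections import deque

# Hypothetical links between entities (treated as undirected).
EDGES = [
    ("Person A", "Shell Co 1"),
    ("Shell Co 1", "Offshore Account X"),
    ("Person B", "Offshore Account X"),
    ("Person C", "Shell Co 2"),
]


def build_adjacency(edges):
    """Map each entity to the set of entities it is directly linked to."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path(edges, start, goal):
    """Breadth-first search; returns the shortest chain of entities, or None."""
    adj = build_adjacency(edges)
    if start not in adj or goal not in adj:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adj[path[-1]]):   # sorted for deterministic order
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path(EDGES, "Person A", "Person B"))
```

In this toy data Person A and Person B are connected through a shell company and a shared offshore account, while Person C has no link to either.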
Background
• Global financial services firm with trillions of
dollars in assets
• Varying compliance and governance
considerations
• Incredibly complex transaction systems, with
ever-growing opportunities for fraud
Business Problem
• Needed to spot and prevent fraud in
real time, especially in payments that fall within
“normal” behavior metrics
• Needed more accurate and faster credit risk
analysis for payment transactions
• Needed to dramatically reduce chargebacks
Solution and Benefits
• Lowered TCO by simplifying credit risk analysis and
fraud detection processes
• Identify entities and connections uniquely
• Saved billions by reducing chargebacks and fraud
• Enabled building real-time apps with non-uniform
data and no sparse tables or schema changes
London and New York Financial FINANCIAL SERVICES
Fraud Detection
Background
• One of the world’s oldest and largest banks
• 100+ year-old bank with more than 1000
predecessor institutions
• 500,000 employees and contractors
• Needed to manage and visualize ~50,000 Unix
servers in its network
Business Problem
• Original RDBMS solution could handle only
5,000 servers
• Improve net performance company-wide
• Leverage M&A legacy systems with no room
for error
Solution and Benefits
• Store UNIX server and network config in Neo4j
• Combine Splunk log data into an application
that visualizes events on the network
• Neo4j vastly improved app performance
• New apps built much faster with Neo4j than SQL
Large Investment Bank FINANCIAL SERVICES
Network and IT Operations
What the analysts say
“Neo4j is the clear
leader in the property
graph space”
Forrester Research has
named Neo4j “Most Popular
Graph Database”
Part of Gartner’s “Operational
Database” Magic Quadrant
since 2014
The Neo4j ecosystem
Over 250
customers
400+ yearly
events and
meetups
100k+ users
and
developers
400+
Startups
Please get in touch
Bill Brooks
bill@neo4j.com

Neo4j GraphTalks - Introduction to Graph Databases and Neo4j
