The Neo4j GraphTalks event in November 2016 included:
1) An introduction to graph databases and Neo4j by Bruno Ungermann from Neo4j.
2) Darko Krizic from PRODYNA AG presenting their experience implementing a global knowledge hub for product information using Neo4j.
3) An open networking session.
2. Neo4j GraphTalks
• 09:00-09:30 Breakfast and networking
• 09:30-10:15 Introduction to graph databases and Neo4j
(Bruno Ungermann, Neo4j)
• 10:15-11:00 ADAMA: Global knowledge hub for product information
Lessons learned from the implementation, and demo
(Darko Krizic, CTO PRODYNA AG)
• Open end
10. “We found Neo4j to be literally thousands of times faster
than our prior MySQL solution, with queries that require
10-100 times less code. Today, Neo4j provides eBay with
functionality that was previously impossible.”
- Volker Pacher, Senior Developer
“Minutes to milliseconds” performance
Queries up to 1000x faster than other tested database types
Speed
11. Discrete Data
Minimally
connected data
Neo4j is designed for data relationships
Other NoSQL Relational DBMS Neo4j Graph DB
Connected Data
Focused on
Data Relationships
Development Benefits
Easy model maintenance
Easy query
Deployment Benefits
Ultra high performance
Minimal resource usage
Use the Right Database for the Right Job
12. Timeline: 2000 2003 2007 2009 2011 2012 2013 2014 2015
• GraphConnect, first conference for graph DBs
• First Global 2000 customer
• Introduced first and only declarative query language for property graph
• Published O’Reilly book on Graph Databases
• First native graph DB in 24/7 production
• Invented property graph model
• Contributed first graph DB to open source
• Extended graph data model to labeled property graph
• 150+ customers
• 50K+ monthly downloads
• 500+ graph DB events worldwide
Neo4j: The Graph Database Leader
15. “Forrester estimates that over 25% of enterprises will be using graph
databases by 2017”
“Neo4j is the current market leader in graph databases.”
“Graph analysis is possibly the single most effective competitive
differentiator for organizations pursuing data-driven operations and
decisions after the design of data capture.”
IT Market Clock for Database Management Systems, 2014
https://www.gartner.com/doc/2852717/it-market-clock-database-management
TechRadar™: Enterprise DBMS, Q1 2014
http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801
Graph Databases – and Their Potential to Transform How We Capture Interdependencies (Enterprise Management Associates)
http://blogs.enterprisemanagement.com/dennisdrogseth/2013/11/06/graph-databasesand-potential-transform-capture-interdependencies/
Neo4j Leads the Graph Database Revolution
20. Adidas Shared Meta Data Service
20 Knowledge Management
Background
• Global leader in the sporting goods industry:
footwear, apparel, hardware; 14.5 bn in sales,
53,000 people
• Multitude of products, markets, media, assets and
audiences
Business Problem
• Beset by a wide array of information silos including
data about products, markets, social media, master
data, digital assets, brand content and more
• Provide the most compelling and relevant content to
consumers
• Offering enhanced recommendations to drive
revenue
Solution and Benefits
• Save time and cost through standardized access to a
content-sharing system for internal teams, partners and
IT units: fast, reliable, searchable, avoiding
redundancy
• Improve customer experience and increase revenue by
providing relevant content and recommendations
21. Background
• Toy Manufacturer, founded 80+ years ago, plastic
figurines sold in 50+ countries
• 100 million, 250 employees
• Production process in different countries such as China
• Polymer processing, children's toys, high
responsibility
Business Problem
• Product-related data stored in many different data
stores including SAP, Navision, laboratory
systems, document systems, PowerPoint, Excel, ...
• Hard to find correct answers for authorities,
internal stakeholders and parents
Solution and Benefits
• Neo4j powers integrated platform that provides
visibility across whole supply chain
• Domain Experts create and evolve data model
• Correct answers within seconds
Schleich Product Information Management
21 Knowledge Management
22. Schleich: transparency & reliable answers
Is there a critical substance?
[Diagram: graph connecting product, materials, substances, lab tests, measured values, statutory thresholds, law, local context, batches and processing steps. Example product: Bartagame 14675 (batch 11A1)]
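The question on this slide can be read as a traversal from a product through its materials and substances to lab measurements and statutory thresholds. A minimal Python sketch of that idea follows. It is not Schleich's implementation: the material names, substance names, measured values, thresholds and the `critical_substances` helper are all invented for illustration; only the product name comes from the slide.

```python
# Hypothetical sample data. Only the product name appears on the slide;
# materials, substances, values and thresholds are made up for illustration.
CONTAINS = {
    "Bartagame 14675 (batch 11A1)": ["body material", "paint"],
    "body material": ["substance A"],
    "paint": ["substance B"],
}
MEASURED = {"substance A": 0.2, "substance B": 7.5}  # lab test values
THRESHOLDS = {  # statutory threshold per (substance, local context)
    ("substance A", "EU"): 1.0,
    ("substance B", "EU"): 5.0,
}


def critical_substances(product, context):
    """Walk product -> materials -> substances and return every substance
    whose measured value exceeds the threshold for the given context."""
    critical, seen, stack = [], set(), [product]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        limit = THRESHOLDS.get((node, context))
        if limit is not None and MEASURED.get(node, 0.0) > limit:
            critical.append(node)
        stack.extend(CONTAINS.get(node, []))
    return sorted(critical)


print(critical_substances("Bartagame 14675 (batch 11A1)", "EU"))
```

In a graph database the same question would be a single pattern-matching query; the dictionaries here only stand in for the relationships.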
23. Schleich example:
joint development by domain experts & architects
law XYZ
product idea
briefing board
concept board
budget
project
model
project
profile
product
version
product
components
(bill of material BOM)
chemical risk
assessment
component X
tool
technical
specification and
documents
production process
approval
approval
approval
24. Background
• Mid-size German insurer founded in 1858
• Project executed by Delvin, a subsidiary
of die Bayerische Versicherung and an IT insurance
specialist
Business Problem
• Field sales needed easy, dynamic, 24/7 access to
policies and customer data
• Existing DB2 system unable to meet performance
and scaling demands
Solution and Benefits
• Enabled flexible searching of policies and associated
personal data
• Raised the bar on industry practices
• Delivered high performance and scalability
• Ported existing metadata easily
Die Bayerische Versicherung INSURANCE
Knowledge Management
25. Background
• Leading European Airline
• 100+ mln passengers
• 2+ mln tons freight per year
• 700+ aircraft
Business Problem
• Need for flexible, high-performance inflight asset
management, onboard entertainment, BYOD
• Complex data set: CMDB, CMS, aircraft data feed,
media library
• Maintain individual configuration for each aircraft
• Complex data model: aircraft, hardware, virtual
containers, licenses, business rules, versions,
content ...
Solution and Benefits
• Neo4j powers integrated platform that provides fast
access to all aspects needed to maintain complex
system
• Fast implementation
• Highly flexible data model enables constant evolution
Lufthansa Digital Asset Management
25 Graph Based Search, Knowledge Management
35. Business Problem
• Optimize walmart.com user experience
• Connect complex buyer and product data to gain
super-fast insight into customer needs and product
trends
• RDBMS couldn’t handle complex queries
Solution and Benefits
• Replaced complex batch process with real-time online
recommendations
• Built simple, real-time recommendation system with
low-latency queries
• Serve better and faster recommendations by
combining historical and session data
Background
• Founded in 1962 and based in Arkansas
• 11,000+ stores in 27 countries with walmart.com
online store
• 2M+ employees and $470 billion in annual
revenues
Walmart RETAIL
Real-Time Recommendations
36. Background
• One of the world’s largest logistics carriers
• Projected to outgrow capacity of old system
• New parcel routing system
Single source of truth for entire network
B2C and B2B parcel tracking
Real-time routing: up to 7M parcels per day
Business Problem
• Needed 365x24x7 availability
• Peak loads of 3000+ parcels per second
• Complex and diverse software stack
• Need predictable performance, linear scalability
• Daily changes to logistics network: route from any
point to any point
Solution and Benefits
• Ideal domain fit: a logistics network is a graph
• Extreme availability, performance via clustering
• Greatly simplified routing queries vs. relational
• Flexible data model reflects real-world data variance
much better than relational
• Whiteboard-friendly model easy to understand
Accenture LOGISTICS
36 Real-Time Routing Recommendations
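Because a logistics network is a graph, routing "from any point to any point" becomes a shortest-path problem. The sketch below uses Dijkstra's algorithm in plain Python to illustrate the idea; it is not the carrier's actual system. The depot and hub names, the edge costs and the `shortest_route` function are hypothetical.

```python
import heapq

# Hypothetical network: {node: {neighbor: cost}}. All names and costs are invented.
NETWORK = {
    "depot A": {"hub 1": 4, "hub 2": 2},
    "hub 2": {"hub 1": 1, "depot B": 7},
    "hub 1": {"depot B": 3},
    "depot B": {},
}


def shortest_route(network, source, target):
    """Dijkstra's algorithm. Returns (total_cost, path) or None if unreachable."""
    queue = [(0, source, [source])]
    settled = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue  # already reached this node at least as cheaply
        settled[node] = cost
        for neighbor, step in network.get(node, {}).items():
            heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return None


print(shortest_route(NETWORK, "depot A", "depot B"))
```

A daily change to the network is then just an update to the edges; the routing logic itself stays the same.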
37. Background
• San Jose-based communications equipment giant
ranks #91 in the Global 2000 with $44B in annual
sales
• Needed real-time recommendations to encourage
knowledge base use on company’s support portal
Solution and Benefits
• Faster problem resolution for customers and
decreased reliance on support teams
• Scrape cases, solutions, articles et al continuously for
cross-reference links
• Provide real-time reading recommendations
• Uses Neo4j Enterprise HA cluster
Business Problem
• Reduce call-center volumes and costs via improved
online self-service quality
• Leverage large amounts of knowledge stored in
service cases, solutions, articles, forums, etc.
• Reduce resolution times and support costs
Cisco COMMUNICATIONS
Real-Time Recommendations
[Diagram: graph of Solution, Support Case, Knowledge Base Article and Message nodes linked to each other]
40. Background
• Second largest communications company
in France
• Based in Paris, part of Vivendi Group, partnering
with Vodafone
Solution and Benefits
• Flexible inventory management supports modeling,
aggregation, troubleshooting
• Single source of truth for entire network
• New apps model network via near-1:1 mapping
between graph and real world
• Schema adapts to changing needs
Network and IT Operations
SFR COMMUNICATIONS
Business Problem
• Infrastructure maintenance took a week to plan due
to the need to model network impacts
• Needed what-if analysis to model unplanned outages
• Identify network weaknesses to uncover need for
additional redundancy
• Info lived on 30+ systems, with daily changes
[Diagram: network graph of Router, Switch, Service, Fiber Link and Oceanfloor Cable nodes connected by LINKED and DEPENDS_ON relationships]
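The what-if analysis described on this slide amounts to following DEPENDS_ON relationships backwards from a failed component. The Python sketch below illustrates that with a breadth-first search. The topology, the numbered node names and the `impacted_by` helper are assumptions made for the example, not SFR's actual model.

```python
from collections import defaultdict, deque

# Hypothetical topology as (dependent, dependency) pairs, i.e.
# "Service DEPENDS_ON Router 1". Names are invented for illustration.
DEPENDS_ON = [
    ("Service", "Router 1"),
    ("Router 1", "Switch 1"),
    ("Switch 1", "Fiber Link 1"),
    ("Fiber Link 1", "Oceanfloor Cable"),
    ("Router 2", "Switch 2"),
    ("Switch 2", "Fiber Link 2"),
]


def impacted_by(failed, edges):
    """Return every component that directly or transitively depends on `failed`."""
    dependents = defaultdict(list)
    for dependent, dependency in edges:
        dependents[dependency].append(dependent)
    impacted, queue = set(), deque([failed])
    while queue:
        node = queue.popleft()
        for dependent in dependents[node]:
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


print(sorted(impacted_by("Oceanfloor Cable", DEPENDS_ON)))
```

Components that appear in the impact set of many single failures point to places where additional redundancy may be needed.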
41. Business Problem
• Original RDBMS solution could handle only 5,000
servers
• Improve net performance company-wide
• Leverage M&A legacy systems with no room
for error
Solution and Benefits
• Store UNIX server and network config in Neo4j
• Combine Splunk log data into an application
that visualizes events on the network
• Neo4j vastly improved app performance
• New apps built much faster with Neo4j than SQL
Large Investment Bank FINANCIAL SERVICES
Network and IT Operations
Background
• One of the world’s oldest and largest banks
• 100+ year-old bank with more than 1000
predecessor institutions
• 500,000 employees and contractors
• Needed to manage and visualize ~50,000 Unix
servers in its network
42. Identity Relationship Management
Identity Access Management
Applications
and data
Endpoints
People
Customers
(millions)
Partners and
Suppliers
Workforce
(thousands)
PCs Tablets
On-premises Private Cloud Public Cloud
Things
(Tens of
millions)
Wearables Phones
PCs
Customers
(millions)
On-premises
Applications
and data
Endpoints
People
43. Background
• Oslo-based telecom provider is #1 in Nordic
countries and #10 in world
• Online, mission-critical, self-serve system lets
users manage subscriptions and plans
• Availability and responsiveness are critical to
customer satisfaction
Business Problem
• Logins took minutes to retrieve relational
access rights
• Massive joins across millions of plans,
customers, admins, groups
• Nightly batch production required 9 hours and
produced stale data
Solution and Benefits
• Shifted authentication from Sybase to Neo4j
• Moved resource graph to Neo4j
• Replaced batch process with real-time login response
measured in milliseconds that delivers real-time data
vs. yesterday's snapshot
• Mitigated customer retention risks
Identity and Access Management
Telenor COMMUNICATIONS
[Diagram: Account, Customer, CustomerUser and Subscription nodes connected by SUBSCRIBED_BY, CONTROLLED_BY, PART_OF and USER_ACCESS relationships]
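Resolving access rights at login can be sketched as a short traversal over the relationship types named on this slide. The Python example below is only an illustration: the direction of each relationship, the sample identifiers and the `accessible_subscriptions` helper are assumptions, since the slide does not show how Telenor modeled them.

```python
# Hypothetical edges as (source, relationship, target). Relationship names come
# from the slide; directions and all identifiers are assumed for illustration.
EDGES = [
    ("user:anna", "USER_ACCESS", "account:100"),
    ("sub:mobile-1", "PART_OF", "account:100"),
    ("sub:mobile-2", "PART_OF", "account:100"),
    ("sub:fiber-9", "PART_OF", "account:200"),
    ("account:100", "CONTROLLED_BY", "customer:acme"),
    ("sub:mobile-1", "SUBSCRIBED_BY", "customer:acme"),
]


def accessible_subscriptions(user, edges):
    """Subscriptions that are PART_OF any account the user has USER_ACCESS to."""
    accounts = {dst for src, rel, dst in edges
                if src == user and rel == "USER_ACCESS"}
    return sorted(src for src, rel, dst in edges
                  if rel == "PART_OF" and dst in accounts)


print(accessible_subscriptions("user:anna", EDGES))
```

The lookup touches only the edges around one user, which is why a graph traversal can answer it per login instead of relying on a nightly batch of joins.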
44. Background
• Top investment bank with $1+ trillion in assets
• Using a relational database and Gemfire to manage
employee permissions to research document and
application-service resources
• Permissions for new investment managers and
traders provisioned manually
Business Problem
• Lost an average of 5 days per new hire while they
waited to be granted access to hundreds of
resources, each with its own permissions
• Replace an unsuccessful onboarding process
implemented by a competitor
• Regulations left no room for error
Solution and Benefits
• Store models, groups and entitlements in Neo4j
• Exceeded performance requirements
• Major productivity advantage due to domain fit
• Graph visualization eases the permissioning process
• Fewer compromises than with relational
• Expanded Neo4j solution to online brokerage
UBS FINANCIAL SERVICES
Identity and Access Management
46. Fraud Detection With Connected Analysis
[Chart: Revolving Debt plotted against Number of Accounts, marking normal behavior and a fraudulent pattern]
47. Background
• Global financial services firm with trillions of dollars
in assets
• Varying compliance and governance
considerations
• Incredibly complex transaction systems, with ever-
growing opportunities for fraud
Business Problem
• Needed to spot and prevent fraud in real
time, especially in payments that fall within “normal”
behavior metrics
• Needed more accurate and faster credit risk analysis
for payment transactions
• Needed to dramatically reduce chargebacks
Solution and Benefits
• Lowered TCO by simplifying credit risk analysis and
fraud detection processes
• Identify entities and connections uniquely
• Saved billions by reducing chargebacks and fraud
• Enabled building real-time apps with non-uniform data
and no sparse tables or schema changes
London and New York Financial FINANCIAL SERVICES
Fraud Detection
48. Background
• Panama-based law firm Mossack Fonseca sets up and
hosts “letterbox companies”
• Suspected of supporting tax avoidance and organized
crime
• Altogether: 2.6 TB, 11 million files, 214,000 letterbox
companies
Business Problem
• Goal to unravel chains Bank-Person–Client–
Address–Intermediaries – M&F
• Earlier cases: spreadsheet-based analysis (back and
forth) and pencil to extract such connections
• This case: the sheer amount of data and the arbitrary
chain length doom such approaches to failure
Solution and Benefits
• 400 journalists investigate, update and share; only
2 people with an IT background
• Identify connections quickly and easily
• Fast results would not have been possible without a graph database
Panama Papers Fraud Detection
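Unraveling a chain of arbitrary length between two entities is a path search. The Python sketch below shows a breadth-first search that returns the shortest connecting chain. The entity names, the `LINKS` list and the `connecting_chain` function are invented for illustration and do not reflect the journalists' actual data or tooling.

```python
from collections import deque

# Hypothetical, undirected links between entities; all names are invented.
LINKS = [
    ("Bank X", "Person P"),
    ("Person P", "Client C"),
    ("Client C", "Address A"),
    ("Address A", "Intermediary I"),
    ("Intermediary I", "Law firm"),
    ("Person Q", "Client D"),
]


def connecting_chain(links, start, goal):
    """Breadth-first search; returns the shortest chain from start to goal, or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            chain = []
            while node is not None:  # walk back to the start
                chain.append(node)
                node = previous[node]
            return chain[::-1]
        for neighbor in neighbors.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


print(connecting_chain(LINKS, "Bank X", "Law firm"))
```

The search does not need to know the chain length in advance, which is the property that spreadsheet-based analysis lacks.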
More concrete and closer to reality
Flexible, no fixed schema
And deriving value from data-relationships is exactly what some of the most successful companies in the world have done.
Google created perhaps the most valuable advertising system of all time on top of its search engine, which is based on relationships between web pages.
LinkedIn created perhaps the most valuable HR tool ever, based on relationships among professionals.
And this is also what PayPal did, creating a peer-to-peer transaction service based on relationships.
When it comes to shopping online, probably the most important feature is the product recommendations you make, because they will have a direct impact on your sales.
Of course, we all know Amazon has set the standard for how online recommendations work. In this example we see a user who’s looking to buy a “Kitchen Aid”. Normally you would see recommendations based on “Related Products” or something like “People who bought product X also bought product Y”.
This would be a classical retail recommendation. This is also very easy to model with a graph.
The question, though, is whether this is a limited way of looking at recommendations, because you risk leaving out a lot of information about your user that actually affects what a good recommendation is.
For example, it makes NO SENSE to recommend a product to a user who you know has complained about it. And this is information you have in your CRM data.
It makes NO SENSE to recommend a product to a customer who has already bought and returned it. It’s just a bad recommendation.
It makes NO SENSE to recommend a product that you don’t have in stock.
Say this is a Christmas present… it makes NO SENSE to recommend a product that you can’t deliver in time for Christmas. And you know this, because your logistics data will tell you.
And lastly, this person looking at a kitchen aid could be an anomaly; it doesn't necessarily mean he’s interested in kitchen stuff at all. Your payments and purchase data will tell you this. Perhaps this person’s pattern tells you that what you should actually recommend is “Surfing Equipment” or…
…smart TVs.
The important thing to remember is: the more you know about your consumer, the more relevant your recommendations will be, and the better the chance that you’ll actually make a sale. This is a numbers game, and once you start doing this at scale…
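The rules in these notes can be sketched as filters applied to a list of candidate products. The Python example below is a hypothetical illustration: the product names, the `CATALOG` and `CUSTOMER` structures, the field names and the `recommend` function are all assumptions, standing in for CRM, inventory and logistics data.

```python
# Hypothetical data standing in for inventory/logistics and CRM records.
CATALOG = {
    "stand mixer": {"stock": 5, "delivery_days": 2},
    "blender": {"stock": 0, "delivery_days": 2},
    "toaster": {"stock": 3, "delivery_days": 14},
    "kettle": {"stock": 8, "delivery_days": 1},
    "juicer": {"stock": 4, "delivery_days": 3},
}
CUSTOMER = {"complained_about": {"kettle"}, "returned": {"juicer"}}


def recommend(candidates, customer, catalog, days_until_needed=None):
    """Drop candidates the customer complained about or returned, that are out
    of stock, or that cannot be delivered within `days_until_needed` days."""
    result = []
    for product in candidates:
        info = catalog[product]
        if product in customer["complained_about"]:
            continue
        if product in customer["returned"]:
            continue
        if info["stock"] <= 0:
            continue
        if days_until_needed is not None and info["delivery_days"] > days_until_needed:
            continue
        result.append(product)
    return result


print(recommend(list(CATALOG), CUSTOMER, CATALOG, days_until_needed=5))
```

In a graph model each of these checks is a relationship on the customer or product node, so they can be evaluated together in one query rather than across separate systems.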
When we say that networks are graphs, we mean that networks by default are entities that are connected. If you do a quick search on “network topology” you basically end up with a display of a bunch of graphs…
And if we zoom in on one of them, which seems to be a mesh network of some sort, with routers, gateways — this would be very easy to translate and model into a graph in Neo4j.
So let’s see what’s happening in the world of IAM.
Access management used to be pretty straightforward, and IAM processes used to represent a pretty simplistic view of what access meant. People accessed applications hosted on-premises, through specific devices. In a scenario like this one, access management isn’t really that complicated.
Today, this is simply not the reality. As we discussed previously, 1) people take on several different roles, 2) they will (even if you don’t think about it) be connected to and require secure access to millions of things, using different types of devices with different types of dependencies, and 3) all of these individuals and roles will expect to access and use services and applications in a very granular and personalized way.
So all of this is, of course, highly interconnected.
And all these relationships have tremendous value, and your IAM processes have an enormously important role to play, from many different perspectives.
…And I think this picture shows you that what’s emerging are the incredibly rich data relationships between people and things, and the different personas of people and things. The job of IAM is going to be to use these relationships to manage who gets access to what: whether it is about accessing data coming from an IoT device, controlling devices remotely, whether a device should have access to a cloud API, or whether a person can share information with another person. In all these scenarios you can provide a richer experience by leveraging the relationships between all these people and things, playing out the different scenarios and asking those questions in real time.
This is what the world looks like, and it’s scaling rapidly. We’re going to reach an environment where we’ll see connected devices and people by the billions, so just imagine how many data relationships have to be in place to make sense of all this, knowing that when devices are connected without being properly secured, it’s a huge risk from a privacy and cyber security point of view.
So data-relationships are going to be a key part of the future when we build IAM-systems and when managing digital identity.
And an enterprise that doesn’t appreciate and understand the full complexity of who its customers are in an environment like this will probably start faltering quite quickly.
So it’s very exciting times for IAM, and especially for graph databases within IAM. I think how we securely manage these billions of relationships between users and things, and collaborators, employees, customers and consumers is going to be one of the epic undertakings of the future.
[In this simple approach to detecting credit card fraud, it is relatively easy to spot outliers. But what if the fraudster commits fraud while still exhibiting normal behavior? Well, this is exactly how fraud rings operate.]
[A fraud ring rarely strays outside the normal behavior band. Instead they operate within normal limits and commit widespread fraud. This is very hard to detect by systems that are looking for outliers or activities outside the normal band.]
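One way to look for rings instead of outliers is to group accounts by their connections. The Python sketch below assumes, purely for illustration, that ring members share identifiers such as a phone number or address (the notes do not say which connections are used) and groups accounts with a union-find structure. The account data and the `find_rings` function are hypothetical.

```python
# Hypothetical accounts and identifiers; sharing an identifier is the assumed
# signal for a connection. None of this data comes from the source.
ACCOUNTS = {
    "acc1": ["phone:555-0100", "addr:1 Main St"],
    "acc2": ["phone:555-0100"],
    "acc3": ["addr:1 Main St"],
    "acc4": ["phone:555-0199"],
}


def find_rings(accounts, min_size=3):
    """Group accounts connected through shared identifiers (union-find) and
    return the groups with at least `min_size` accounts."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for account, identifiers in accounts.items():
        find(("account", account))
        for identifier in identifiers:
            parent[find(("account", account))] = find(("id", identifier))

    groups = {}
    for account in accounts:
        groups.setdefault(find(("account", account)), []).append(account)
    return sorted(sorted(group) for group in groups.values() if len(group) >= min_size)


print(find_rings(ACCOUNTS))
```

Each account in such a group can look perfectly normal on its own; the signal is the size of the connected group, which an outlier-based check never sees.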