4. Frederik Obermaier, Süddeutsche Zeitung, on the
importance of networks in journalism. From Panel at
Columbia University Feb 23, 2018.
“I’ve only come across 3 or 4 stories in my career that weren’t about networks.”
7. 2.6 TB
11.5 million documents
Emails, Scanned Documents, Bank Statements etc…
[Diagram: example graph with nodes Person A, Person B, Account 123, Bank US, Bank Bahamas, Acme Inc and Address X; callouts label a NODE and a RELATIONSHIP]
8. 2.6 TB
11.5 million documents
Emails, Scanned Documents,
Bank Statements etc…
13. Common Graph Use Cases
• Fraud Detection
• Real-Time Recommendations
• Network & IT Operations
• Master Data Management
• Knowledge Graph
• Identity & Access Management
airbnb
14. “Forrester estimates that over
25% of enterprises will be using
graph databases by 2017.”
Forrester, 2014
15. Popularity of Graphs
DB-engines Ranking of Database Categories
• Graph DBMS
• Key-value stores
• Document stores
• Wide column store
• RDF stores
• Time stores
• Native XML DBMS
• Object oriented DBMS
• Multivalue DBMS
• Relational DBMS
[Chart: popularity trend per database category, 2013 to 2019; the Graph DB line is labeled]
16. Trend No. 5: Graph
…
The application of graph processing and graph DBMSs will grow at 100
percent annually through 2022 to continuously accelerate data preparation
and enable more complex and adaptive data science.
…
Graph analytics will grow in the next few years due to the need to ask
complex questions across complex data, which is not always practical
or even possible at scale using SQL queries.
https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo
February 18, 2019
22. Strictly Confidential
Strategic initiative, led by Thomas Kurian, CEO of Google
Cloud
• Goal to be #2 Enterprise Cloud as the “open source
friendly” alternative to AWS
• Work with known/proven leaders across key areas
• Neo4j/GCP integrated solution beta by EoY 2019
• Initial release of Neo4j DBaaS will be available via the
Google marketplace
Google Cloud Partnership
• Fully managed services running in the cloud, with best efforts made to optimize performance
and latency between the service and application.
• A single user interface to manage apps, which includes the ability to provision and manage
the service from the Google Cloud Console.
• Unified billing, so you get one invoice from Google Cloud that includes the partner’s service.
• Google Cloud support to manage and log support tickets in a single window and not have to
deal with different providers.
23.
• Retail: 7 of top 10
• Finance: 20 of top 25
• Software: 7 of top 10
• Hospitality: 3 of top 5
• Telco: 4 of top 5
• Airlines: 3 of top 5
• Logistics: 3 of top 5
76% of the FORTUNE 100 have adopted or are piloting Neo4j
24.
25. Neo4j Startup Program Expansion
• Free access for startups with up to 50 employees;
under $3M in revenue
• Neo4j Enterprise Edition
• Neo4j Bloom
• Apply at http://neo4j.com/startup-program
• Notable alumni include:
Medium
26.
27. Background
• Over 7M citizens suffer from Diabetes
• Connecting over 400 researchers
• Incorporates over 50 databases, 100k’s of Excel
workbooks, 30 databases of biological samples
• Sought to examine disease from as many angles as
possible.
Business Problem
• Genes are connected by proteins or to metabolites,
and patients are connected with their diets, etc…
• Needed to improve the utilization of immensely
technical data
• Needed to cater to doctors and researchers with
simple navigation, communication and connections
of the graph.
Solution and Benefits
• Dr. Alexander Jarasch, Head of Bioinformatics and
Data Management
• Scientists can conduct parallel research without asking
the same questions or repeating tests
• Built views like a liver sample knowledge graph
DZD - German Center for Diabetes Research
Medical Genomic Research
EE Customer since 2016 Q
28. Background
• Fortune 100 heavy equipment manufacturer
• 27 Million warranty & service documents parsed
• Foundation for AI-based supply chain management
Business Problem
• Improve maintenance predictability
• Need a knowledge base for 27 million warranty
documents and maintenance orders
• Graphs gather context for AI to identify ‘prime
examples’ of connections among parts, suppliers,
customers and their mechanics, to anticipate when
equipment will need servicing and by whom.
Solution and Benefits
• Text to knowledge graph
• Common ontology for complaints, symptoms & parts
• Anticipates when equipment will need servicing
• Improves customer and brand satisfaction
• Maximizes lifespan and value of equipment
Caterpillar Heavy Equipment Manufacturing
Parts Assembly & Equipment Maintenance
29. Background
• Social network of 10M graphic artists
• Peer-to-peer evaluation of art and works-in-progress
• Job sourcing site for creatives
• Massive, millions of updates (reads & writes) to Activity
Feed
• 150 Mongos to 48 Cassandras to 3 Neo4j’s!
Business Problem
• Artists subscribe, appreciate and curate “galleries” of
works of their own and from other artists
• Activity Feed is how everyone receives updates
• 1st implementation was 150 MongoDB instances
• 2nd implementation shrunk to 48 Cassandras, but it
was still too slow and required heavy IT overhead
Solution and Benefits
• 3rd implementation shrunk to 3 Neo4j instances
• Saved over $500k in annual AWS fees
• Reduced data footprint from 50TB to 40GB
• Significantly easier to introduce new features like
“New projects in your Network”
Adobe Behance Social Network of 10M Graphic Artists
Social Network
EE Customer since 2016 Q
32. Background
• Largest Cable TV & Internet Provider in US
• 3rd Largest network on the planet
• xFi is consumer experience in 3M houses
• Internet, router, devices, security, voice & telephony
• Transformational customer experience
Business Problem
• Integrate all experience in a smart home
• Create innovative ideas based on cross-platform and
household member preferences
• Add integrated value of xFinity triple play & quad-
play services (internet, VoIP, cable TV & home
security)
Solution and Benefits
• Custom content per household member
• Security reminders (kids are home, garage left open)
• Serves millions of households
• Makes content recommendations based on occupant,
time of day, permissions and preferences
• Has Siri-like voice commands
COMCAST Xfinity xFi TELECOMMUNICATIONS
Smart Home / Internet of Things
EE Customer since 2016 Q
35. Strictly Confidential
The Market Sees Strong Synergy between Graphs and
Artificial Intelligence
AI research papers focused on graphs
SURGING
INTEREST
New Book:
20K Downloads in first 2 weeks
CONNECTED
CONTEXT FOR AI/ML
CUSTOMER
TRACTION
German Center for
Diabetes Research
41. “Increasingly we're learning that you can make
better predictions about people by getting all
the information from their friends and their
friends’ friends than you can from the
information you have about the person
themselves”
United Health
Amazon, Dell, VMware, Xilinx, Lyft,
Capgemini, Deloitte
Healthcare: Aetna, CVS, United Health, Pfizer, Novartis, Athenahealth
Financial Services: State St., Travelers, Bank of America, John Hancock, Citizens Bank, Fidelity
Bose, IBM
Kent St., Northeastern, Harvard
Capgemini, EY
“The first story is about the Panama Papers, which was the biggest news story of 2016, but its impact is still very much alive: a couple of months ago the prime minister of Pakistan resigned over findings in the Panama Papers, and just last week he was formally indicted for corruption.”
“In this particular story, the heroes are two journalists at the Süddeutsche Zeitung who were provided with a”<click>
“2.6 TB leak that supposedly contained data detailing accounts and activities of the powerful and the wealthy, for legal tax planning but possibly also for illegal tax evasion.”
“So they got this huge 2.6 TB data dump of spaghetti information and they wanted to make sense of it. They ran it through an open source pipeline of technologies and ended up with”<click>“11 MILLION documents, which btw is the largest leak in journalistic history. In these documents are emails, bank accounts, names, addresses etc., and they had to make sense of all that and uncover any newsworthy stories.”
“Now let’s take a step back from data and technology and just think about what investigative journalism is. IJ is all about finding patterns. Here’s an example of a pattern:”<click> Person has Account with Bank. Yadayada, nothing wrong. Blabla lives at Address.
“Now if we look at this more abstractly we can see that we have concepts and how they are related to each other.”
“In the graph world we call these<click>Nodes and<click>Relationships.”
“It turns out with these very simple abstractions — <enumerate them> — we can build and model *everything*. It turns out that this model is very flexible. Easy to evolve. Etc.”
… “and your data model will organically evolve with you as your needs change.”
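A minimal sketch of the node-and-relationship model described in these notes, written in plain Python rather than a graph database. The node names come from the slide's example; the relationship type names (`HAS_ACCOUNT`, `HELD_AT`, `LIVES_AT`, `CONNECTED_TO`) and the `Node`, `Relationship` and `describe` helpers are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A concept in the graph, e.g. a person, an account or a bank."""
    label: str
    name: str


@dataclass
class Relationship:
    """A named, directed connection between two nodes."""
    start: Node
    rel_type: str
    end: Node


def describe(rel: Relationship) -> str:
    """Render a relationship as a small arrow diagram."""
    return f"({rel.start.name})-[:{rel.rel_type}]->({rel.end.name})"


# Nodes taken from the slide's example pattern.
person_a = Node("Person", "Person A")
account = Node("Account", "Account 123")
bank = Node("Bank", "Bank US")
address = Node("Address", "Address X")

# Relationship type names are hypothetical, chosen for readability.
graph = [
    Relationship(person_a, "HAS_ACCOUNT", account),
    Relationship(account, "HELD_AT", bank),
    Relationship(person_a, "LIVES_AT", address),
]

# The model evolves by adding a new kind of node and a new relationship;
# nothing that already exists has to be changed.
acme = Node("Company", "Acme Inc")
graph.append(Relationship(person_a, "CONNECTED_TO", acme))

for rel in graph:
    print(describe(rel))
```

Adding the `Company` node at the end is the point of the sketch: a new concept is one more node and one more relationship, with no change to what was modeled before.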
“What’s equally amazing is if you wrap this data model in an infrastructure that can support not just 7 nodes but”<click>
“a million nodes, or 11.5 million nodes, or a billion nodes, or 100 billion nodes.”
“Ok, so back to our story. Remember that second pattern we discussed before, where someone was connected through his wife to an offshore bank account. Well, here’s the real world example of that: the Icelandic prime minister Sigmundur Gunnlaugsson. Excuse me! The *former* prime minister of Iceland. That’s the type of impact the Panama Papers had.”
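The kind of pattern mentioned here, a person connected through a spouse to an offshore account, can be sketched as a path search over relationships. This is only an illustrative breadth-first search in Python, not how the journalists or Neo4j implemented it; every node name, relationship type and function name below is a placeholder, not data from the leak.

```python
from collections import deque

# Hypothetical edges: (start node, relationship type, end node).
edges = [
    ("Person", "MARRIED_TO", "Spouse"),
    ("Spouse", "HAS_ACCOUNT", "Offshore Account"),
    ("Offshore Account", "HELD_AT", "Offshore Bank"),
    ("Person", "LIVES_AT", "Address"),
]


def build_adjacency(edges):
    """Index edges by node, treating each relationship as traversable both ways."""
    adjacency = {}
    for start, rel_type, end in edges:
        adjacency.setdefault(start, []).append((rel_type, end))
        adjacency.setdefault(end, []).append((rel_type, start))
    return adjacency


def shortest_path(edges, source, target):
    """Breadth-first search; returns the nodes on a shortest path, or None."""
    adjacency = build_adjacency(edges)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for _rel_type, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(edges, "Person", "Offshore Account")
print(" -> ".join(path))
```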
“As mentioned, it rapidly became one of the biggest news stories last year and was written up in virtually every major newspaper in every country in the world.”
And of course when they do something like this something-something last month
At the time, this was considered a bold and shocking prediction.
60% -> 85%
Put together by our friends at GraphAware.
We see this at Neo4j, where as of today 76% of the F100 have either piloted or adopted Neo4j! That’s a staggering amount.
But that’s not enough. As of right now, most of the leading organizations in most of the biggest verticals in the world rely on Neo4j. We already talked about Software and Insurance, but just to give you a sense: 20 of the top 25 global financial services organizations (and 20 of the top 20 US banks) are using Neo4j, 4 of the top 5 telcos and 3 of the top 5 airlines. Graphs have truly arrived in the enterprise.
“And today, we have 470 startups in that program. Look at these logos. You may not recognize all of them, or maybe even one of them. But every one of them has the power of Google in their hands. And I’d be willing to bet that at least one out of these 470 startups will become a household name in the next ten years.”
I’d like to close with a topic that you’ve all heard about, and that many of you may already be working in, and that’s AI. And more precisely, how graphs are starting to be used in AI.
Those of you who were here last year may remember this picture.
It’s a taxonomy of different kinds of machine learning. Looking at the images, it’s very clear that graphs are foundational for Machine Learning!
This then raises the question: how can I use graphs to help with my own AI problem?
Why do other databases also talk about the same use cases?
SHOUT OUT TO CATERPILLAR
The answer is context. Graphs provide the power of connections & context to the ML and AI that you use today
Last year we zoomed into one very important area in AI which is knowledge graphs. A number of customers are using Neo4j for their knowledge graph, including these four who have all spoken about their knowledge graphs at GraphConnect.
[[ worth defining knowledge graphs, verbally or visually? We didn’t here because it adds time & complexity ]]
(Fact check:
eBay spoke about their knowledge graph at GraphConnect NYC ’17
Airbnb presented theirs at GraphConnect Europe ’17
NASA presented theirs at GraphConnect SF ’16
And Cisco at GraphConnect SF ’15, though at the time they used the term Metadata graph
)
We’ve talked about knowledge graphs before and you probably understand those. Let’s therefore look at machine learning. Those of you who have seen this before will recognize this as a typical machine learning pipeline. You train it by feeding it data. That data is input as features or vectors, and once it’s trained you put it into production and you’re off to the races.
James Fowler
What you really want is this… and it turns out there are a number of ways to make this easily possible using Neo4j alongside the tools you already have
This is called: “Connected Feature Extraction”
And there are three distinct techniques that are covered throughout the day, with a great summary by Jake Graham and Amy Hodler.
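As a rough illustration of what a connected feature can look like, the sketch below derives two numbers per person from a small friendship graph: how many friends they have, and how many friends-of-friends. This is an assumed, simplified example in plain Python and is not necessarily one of the three techniques covered in the talks; the names, the data and the `connected_features` function are hypothetical.

```python
# Hypothetical undirected friendship edges; names are placeholders.
friendships = [
    ("ann", "bob"),
    ("bob", "cat"),
    ("cat", "dan"),
    ("ann", "eve"),
]


def neighbors_of(friendships):
    """Map each person to the set of people they are directly connected to."""
    neighbors = {}
    for a, b in friendships:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def connected_features(friendships, person):
    """Return [number of friends, number of friends-of-friends] for one person."""
    neighbors = neighbors_of(friendships)
    friends = neighbors.get(person, set())
    friends_of_friends = set()
    for friend in friends:
        friends_of_friends |= neighbors[friend]
    # Exclude the person and their direct friends from the two-hop set.
    friends_of_friends -= friends | {person}
    return [len(friends), len(friends_of_friends)]


people = ["ann", "bob", "cat", "dan", "eve"]
features = {p: connected_features(friendships, p) for p in people}
print(features)
```

Each resulting list could be appended to a person's existing feature vector before training, so the model sees something about the neighborhood as well as the individual.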
EY – great partner for years, many joint customers
GA – longstanding partner expanded from EMEA. Help with full lifecycle from evaluating a use case through deployment.
“We have an exciting day ahead of us. Let me take this first hour to take a step back and talk a little about the state of the graph space today, and much more importantly talk about where I believe the space is going.”
“It’s been a year since we had a GraphConnect here in the US, and what a year it has been. Graphs have had an impact on an order that we’ve never seen before. Let me give you a couple of examples.”