Relationships are highly predictive of behavior, yet most data science models overlook this information because it's difficult to extract network structure for use in machine learning (ML).
With graphs, relationships are embedded in the data itself, making it practical to add these predictive capabilities to your existing practices.
That’s why we’re presenting and demoing the use of graph-native ML to make breakthrough predictions. This will cover:
- Different approaches to graph feature engineering, from queries and algorithms to embeddings
- How ML techniques leverage everything from classical network science to deep learning and graph convolutional neural networks
- How to generate representations of your graph using graph embeddings, create ML models for link prediction or node classification, and apply these models to add missing information to an existing graph or to incoming data
- Why no-code visualization and prototyping is important
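As a rough illustration of the classical network-science end of the feature engineering spectrum listed above, the sketch below scores candidate links with two well-known topology measures, common neighbors and the Adamic-Adar index. The toy graph, the node names, and every function name are hypothetical; this is plain Python, not the Neo4j Graph Data Science API.

```python
import math
from itertools import combinations

# Hypothetical toy graph: undirected "knows" edges between made-up people.
EDGES = [
    ("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
    ("bob", "dan"), ("cat", "dan"), ("dan", "eve"),
]


def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors that u and v share."""
    return len(adj[u] & adj[v])


def adamic_adar(adj, u, v):
    """Shared neighbors weighted so that low-degree neighbors count more."""
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1
    )


def rank_missing_links(adj):
    """Score every unconnected pair and sort by Adamic-Adar, highest first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    scored = [
        (u, v, common_neighbors(adj, u, v), adamic_adar(adj, u, v))
        for u, v in candidates
    ]
    return sorted(scored, key=lambda row: row[3], reverse=True)


adj = build_adjacency(EDGES)
ranked = rank_missing_links(adj)
for u, v, shared, score in ranked:
    print(f"{u}-{v}: common={shared} adamic_adar={score:.3f}")
```

Scores like these can be used directly as a ranking or fed as feature columns into a downstream classifier; embeddings, by contrast, learn such structure rather than hand-picking it.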
Relationships Matter: Using Connected Data for Better Machine Learning
1. Neo4j, Inc. All rights reserved 2021
Relationships Matter:
Using Connected Data for Better Machine Learning
Alicia Frame, PhD
Director of Product Management, Neo4j
Stuart Laurie
Senior Solutions Architect, Neo4j
2. Neo4j: The Graph Company
Neo4j is the creator of:
• The world’s leading graph database
• The first graph data science platform
• The most flexible graph data model
• The easiest-to-use graph query language
Thousands of Organizations Use Neo4j
• 20 of the top 25 financial firms
• 7 of the top 10 retailers
• 7 of the top 10 software vendors
Silicon Valley, London, Munich, Paris, Malmö
3. Wait, what’s a graph?
Node: Represents an entity in the graph
Relationship: Connects nodes to each other
Property: Describes a node or relationship, e.g. name, age, weight, etc.
Example graph from the slide figure:
ANDRE (Name: “Andre”, Born: May 29, 1970, Twitter: “@dan”)
MICA (Name: “Mica”, Born: Dec 5, 1975)
CAR (Brand: “Volvo”, Model: “V70”)
Relationships shown: LOVES, LIVES WITH, OWNS, DRIVES (one relationship carries the property Since: Jan 10, 2011)
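The node, relationship, and property vocabulary on this slide can be sketched as a tiny in-memory property graph. This is a minimal illustration, not how Neo4j stores data. The class names, the `Person` label, the relationship directions, and the choice to attach `since` to DRIVES are all assumptions, because the transcript does not preserve which arrow goes where.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical minimal property graph: nodes and typed, directed relationships."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_relationship(self, start, rel_type, end, **properties):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before connecting them")
        self.relationships.append(Relationship(start, rel_type, end, properties))

    def outgoing(self, node_id, rel_type=None):
        """Relationships leaving node_id, optionally filtered by type."""
        return [
            r for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node("andre", ["Person"], name="Andre", born="1970-05-29")
graph.add_node("mica", ["Person"], name="Mica", born="1975-12-05")
graph.add_node("car", ["Car"], brand="Volvo", model="V70")

# Directions and the placement of "since" are assumed for illustration.
graph.add_relationship("andre", "LOVES", "mica")
graph.add_relationship("mica", "LOVES", "andre")
graph.add_relationship("andre", "LIVES_WITH", "mica")
graph.add_relationship("andre", "DRIVES", "car", since="2011-01-10")
graph.add_relationship("mica", "OWNS", "car")

for rel in graph.outgoing("andre"):
    print(rel.start, rel.rel_type, rel.end, rel.properties)
```

Note that relationships hold properties just as nodes do, which is what distinguishes a property graph from a bare edge list.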
4. Everything is Naturally Connected
Networks of People: Employees, Customers, Suppliers, Partners, Influencers, etc. (example relationship: Knows)
Transaction Networks: Risk management, Supply chain, Orders, Payments, etc. (example relationships: Bought, Viewed, Returned)
Knowledge Networks: Enterprise content, Domain specific content, eCommerce content, etc. (example relationships: Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for)
5. Neo4j, Inc. All rights reserved 2021
“Relationships are the strongest predictors of behavior” (James Fowler)
But You Can’t Analyse What You Can’t See
● Most data science techniques ignore relationships
● It’s painful to manually engineer connected features from tabular data
● Graphs are built on relationships, so you don’t have to guess at the correlations: with graphs, relationships are built in
6. Neo4j, Inc. All rights reserved 2021
According to Gartner, “Graphs form the foundation of modern D&A, with capabilities to enhance and improve user collaboration, ML models and explainable AI. The recent Gartner AI in Organizations Survey demonstrates that graph techniques are increasingly prevalent as AI maturity grows, going from 13% adoption when AI maturity is lowest to 48% when maturity is highest.”
(Top 10 Tech Trends in Data and Analytics, 16 Feb 2021)
[Chart: AI Research Papers Featuring Graph. Source: Dimensions Knowledge System]
Analytics & Data Science Interest Exploding in Neo4j Community
• 4x increase in traffic to Neo4j GDS page in 2H-2020
• +4.8m views on the graph algorithms short video
• +193k downloads
7. Neo4j, Inc. All rights reserved 2021
Graphs & Data Science
• Queries: Find the patterns you know exist.
• Machine Learning: Uncover trends and make predictions.
• Visualization: Explore, collaborate, and explain.
[Diagram labels: Queries, Machine Learning, Visualization, Analytics, Feature Engineering, Data Exploration, Graph Data Science]
8. Neo4j, Inc. All rights reserved 2021
Graphs & Data Science
• Knowledge Graphs: Find the patterns you’re looking for in connected data.
• Graph Algorithms: Use unsupervised machine learning techniques to identify associations, anomalies, and trends.
• Graph Native Machine Learning: Use embeddings to learn the features in your graph that you don’t even know are important yet. Train in-graph supervised ML models to predict links, labels, and missing data.
9. Neo4j, Inc. All rights reserved 2021
Neo4j’s Graph Data Science Framework
• Neo4j Graph Data Science Library: Scalable Graph Algorithms & Analytics Workspace
• Neo4j Database: Native Graph Creation & Persistence
• Neo4j Bloom: Visual Graph Exploration & Prototyping
10. Neo4j, Inc. All rights reserved 2021
The Neo4j GDS Library
Robust Graph Algorithms & ML methods
● Compute metrics about the topology and connectivity
● Build predictive models to enhance your graph
● Highly parallelized and scale to tens of billions of nodes
Efficient & Flexible Analytics Workspace
● Automatically reshapes transactional graphs into an in-memory analytics graph
● Optimized for global traversals and aggregation
● Create workflows and layer algorithms
● Store and manage predictive models in the model catalog
[Diagram labels: Mutable In-Memory Workspace, Computational Graph, Native Graph Store]
11. Neo4j, Inc. All rights reserved 2021
55+ Graph Data Science Techniques in Neo4j
Pathfinding & Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• A* Shortest Path
• Yen’s K Shortest Path
• Minimum Weight Spanning Tree
• K-Spanning Tree (MST)
• Random Walk
• Breadth & Depth First Search
Centrality & Importance
• Degree Centrality
• Closeness Centrality
• Harmonic Centrality
• Betweenness Centrality & Approx.
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
• Hyperlink Induced Topic Search (HITS)
• Influence Maximization (Greedy, CELF)
Community Detection
• Triangle Count
• Local Clustering Coefficient
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Coloring
• Modularity Optimization
• Speaker Listener Label Propagation
Supervised Machine Learning
• Node Classification
• Link Prediction
Heuristic Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
Similarity
• Node Similarity
• K-Nearest Neighbors (KNN)
• Jaccard Similarity
• Cosine Similarity
• Pearson Similarity
• Euclidean Distance
• Approximate Nearest Neighbors (ANN)
Graph Embeddings
• Node2Vec
• FastRP
• FastRPExtended
• GraphSAGE
… and more!
• Synthetic Graph Generation
• Scale Properties
• Collapse Paths
• One Hot Encoding
• Split Relationships
• Graph Export
• Pregel API (write your own algos)
12. Neo4j, Inc. All rights reserved 2021
What’s New?
13. Neo4j, Inc. All rights reserved 2021
GDS 1.6: GA May 27
Compatible with Neo4j 4.x series:
• Seven algorithms graduated to the fully supported product tier
• All ML models now support model persistence
• Major improvements to our embeddings
• New capabilities like graph filtering, scalers, and more
Product tier algos
● Article Rank, Eigenvector Centrality, Degree Centrality
● Pathfinding
Subgraph Projections
● Project subgraphs based on existing properties in the in-memory graph
Machine Learning
● Up to 3 models in CE
● Model persistence for node classification, link prediction
ML maturity
● Improvements to node classification, link prediction
● Scaling & normalization
Community Contributions
● New algorithms for influence maximization thanks to @xkitsios
14. Neo4j, Inc. All rights reserved 2021
Machine Learning Improvements
Community Edition users now have up to 3 trained models 🎉
… But that’s not all:
• We’ve added gds.alpha.scaleProperties, supporting min-max, max, mean, log, standard score, L1 and L2 norm scaling for properties
• NodeClassification and LinkPrediction now support stream and write modes, and their models can be saved, published and restored
• Node2Vec has been promoted to the beta tier: significantly faster, supports weights, seeding, and mutate mode
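To make the scaler names above concrete, here is a minimal sketch of what each one computes for a single numeric property. It is plain Python using the textbook definitions; it does not call the GDS library, and the exact formulas inside gds.alpha.scaleProperties are assumed rather than verified to match these (for example, the population standard deviation for the standard score). All function names are hypothetical.

```python
import math


def min_max(values):
    """Rescale to [0, 1]: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def max_scale(values):
    """Divide by the largest absolute value."""
    peak = max(abs(v) for v in values)
    return [v / peak if peak else 0.0 for v in values]


def mean_scale(values):
    """Center on the mean, then divide by the range."""
    avg = sum(values) / len(values)
    span = max(values) - min(values)
    return [(v - avg) / span if span else 0.0 for v in values]


def log_scale(values):
    """Natural logarithm; only defined for positive values."""
    return [math.log(v) for v in values]


def std_score(values):
    """Center on the mean, then divide by the population standard deviation."""
    avg = sum(values) / len(values)
    std = math.sqrt(sum((v - avg) ** 2 for v in values) / len(values))
    return [(v - avg) / std if std else 0.0 for v in values]


def l1_norm(values):
    """Divide by the sum of absolute values."""
    total = sum(abs(v) for v in values)
    return [v / total if total else 0.0 for v in values]


def l2_norm(values):
    """Divide by the Euclidean length of the value vector."""
    length = math.sqrt(sum(v * v for v in values))
    return [v / length if length else 0.0 for v in values]


degrees = [1, 2, 3]
print(min_max(degrees))
print(mean_scale(degrees))
print(l2_norm([3, 4]))
```

Scaling matters for ML on graph features because properties such as degree and PageRank live on very different numeric ranges, and many models are sensitive to that.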
15. Neo4j, Inc. All rights reserved 2021
Subgraph Projections
You can now create a new in-memory graph by filtering based on properties in your existing one with gds.beta.graph.subgraph:
• Use native projections and subset your graph, instead of using expensive Cypher projections
• Pre-process your data for faster execution, for example calculating degree centrality and removing high/low degree nodes, or running WCC and creating graphs for each component
• Chain algorithms together by filtering on properties, like running Louvain and then calculating nodeSimilarity for each community
Example filters: node.class = 1; Degree > 1; Louvain Community ID = 4
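The idea behind a subgraph projection can be sketched in a few lines: keep the nodes whose properties satisfy a filter, then keep only the relationships whose two endpoints both survived. The code below is a conceptual illustration in plain Python, not the GDS procedure itself; the `subgraph` function, the property names `degree` and `community`, and the sample data are all hypothetical.

```python
def subgraph(nodes, relationships, node_filter):
    """Return the nodes passing node_filter and the relationships between them.

    nodes: dict mapping node id -> property dict
    relationships: list of (source id, target id) pairs
    node_filter: function taking a property dict and returning True to keep the node
    """
    kept_nodes = {nid: props for nid, props in nodes.items() if node_filter(props)}
    kept_rels = [(s, t) for s, t in relationships if s in kept_nodes and t in kept_nodes]
    return kept_nodes, kept_rels


# Hypothetical in-memory graph with properties written by earlier algorithm runs.
nodes = {
    "a": {"degree": 3, "community": 4},
    "b": {"degree": 2, "community": 4},
    "c": {"degree": 1, "community": 4},
    "d": {"degree": 2, "community": 7},
}
relationships = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d")]

# Filter in the spirit of "Degree > 1".
dense_nodes, dense_rels = subgraph(nodes, relationships, lambda p: p["degree"] > 1)

# Chaining: filter on a community id produced by a previous step.
community_nodes, community_rels = subgraph(
    nodes, relationships, lambda p: p["community"] == 4
)

print(sorted(dense_nodes), dense_rels)
print(sorted(community_nodes), community_rels)
```

Because the filter reads properties that earlier algorithms wrote, each step can narrow the graph the next algorithm runs on, which is what makes chaining cheap.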
16. Neo4j, Inc. All rights reserved 2021
Influence Maximization Algorithms
Finding the nodes in a graph that can trigger cascading changes:
• Who do I market to, to drive the most adoption?
• Which blogs should I read to get news first?
• Who should you test to get early warning of an outbreak?
… or: Given a network with n nodes and a “spreading” or propagation process on that network, choose a “seed set” s of size k < n to maximize the number of nodes in the network that are ultimately influenced
17. Neo4j, Inc. All rights reserved 2021
This is a combinatorial optimization problem - computationally complex!
● Greedy method: polynomial time approximation
● CELF method: faster than greedy on realistic network sizes and structures
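The greedy method can be sketched as follows: repeatedly add the node that gives the largest estimated spread when joined with the seeds chosen so far. This sketch assumes an Independent Cascade propagation model with one fixed activation probability and estimates spread by Monte Carlo simulation; the slides do not say which propagation model GDS uses, so treat the model, the function names, and the parameters as illustrative assumptions. CELF reaches the same kind of result with fewer spread evaluations by exploiting diminishing returns and is not shown here.

```python
import random


def simulate_spread(graph, seeds, prob, rng):
    """Run one Independent Cascade simulation and return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each newly active node gets one chance to activate each neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, prob, runs, rng):
    """Average the spread over several simulations."""
    return sum(simulate_spread(graph, seeds, prob, rng) for _ in range(runs)) / runs


def greedy_seed_set(graph, k, prob=0.1, runs=200, seed=42):
    """Pick k seeds, each time adding the node with the best estimated total spread."""
    rng = random.Random(seed)
    nodes = sorted(set(graph) | {n for targets in graph.values() for n in targets})
    chosen = []
    for _ in range(min(k, len(nodes))):
        best_node, best_spread = None, -1.0
        for candidate in nodes:
            if candidate in chosen:
                continue
            spread = expected_spread(graph, chosen + [candidate], prob, runs, rng)
            if spread > best_spread:
                best_node, best_spread = candidate, spread
        chosen.append(best_node)
    return chosen


# Hypothetical directed network: "a" reaches b, c, d; "x" reaches y.
network = {"a": ["b", "c"], "b": ["d"], "x": ["y"]}
print(greedy_seed_set(network, k=2, prob=1.0, runs=1))
```

With the probability set to 1.0 the simulation is deterministic, which makes the behaviour easy to check by hand: "a" reaches four nodes on its own, and adding "x" reaches the remaining two.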
18. Neo4j, Inc. All rights reserved 2021
These algorithms were contributed by community member @xkitsios 💕
19. Neo4j, Inc. All rights reserved 2021
Real World Use Cases
20. Neo4j, Inc. All rights reserved 2021
Accelerate Innovation using Neo4j Graph Data Science
From Simple to Highly Sophisticated Data Science
[Chart: use cases plotted by Analysis Repeatability (Simple, Ad Hoc to Full Production) and Analysis Complexity (up to High), moving from Analytics to Data Science]
• Analytics to improve reliability by predicting problems in a supply-chain knowledge graph
• Disambiguation with graph algorithms at scale
• R&D: Better health outcomes through machine learning on patient journeys
21. Neo4j, Inc. All rights reserved 2021
Boston Scientific: Finding At-Fault Components
• Challenge: Difficulty finding at-fault components via ad hoc analytics on a vertically integrated supply chain
• Solution: Uses a knowledge graph to model and analyze their complex products
• Results:
○ Quickly pinpoint root causes of problems
○ Reduced query times from two minutes to seconds
○ Anti-recommendation using graph algorithms to identify and eliminate bad combinations of components
22. Neo4j, Inc. All rights reserved 2021
Meredith Corp: Identifying the Anonymous
• Challenge: It’s hard to make recommendations to anonymous users
• Solution: Connect first and third party cookies using graph algorithms to create unique profiles
• Results:
○ Converted 14B anonymous data points into 163M user profiles
○ Drove 612% increase in web traffic
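One common way to turn linked identifiers into profiles is connected components: every set of cookies and devices that can reach one another through observed links becomes one profile. The slide does not say which algorithms Meredith used, so the union-find sketch below is only an illustration of the general technique, with hypothetical identifiers and function names.

```python
def find(parent, item):
    """Return the representative of item's component, shortening paths as it goes."""
    while parent[item] != item:
        parent[item] = parent[parent[item]]  # path halving
        item = parent[item]
    return item


def build_profiles(links):
    """Group identifiers into profiles, given pairs observed together."""
    parent = {}
    for a, b in links:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components
    profiles = {}
    for item in parent:
        profiles.setdefault(find(parent, item), set()).add(item)
    return sorted(sorted(members) for members in profiles.values())


# Hypothetical observations: each pair was seen in the same session.
links = [
    ("cookie1", "device1"),
    ("device1", "cookie2"),
    ("cookie3", "device2"),
]
print(build_profiles(links))
```

Here `cookie1` and `cookie2` never appear together, but both were seen with `device1`, so they end up in the same profile.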
23. Neo4j, Inc. All rights reserved 2021
AstraZeneca: Patient Journey
“We used graph algorithms to find patients that had specific journey types and patterns and then find others that are close and similar.”
Joseph Roemer, Global Commercial IT Insight & Analytics Sr. Director, AstraZeneca
● Challenge: How to best intervene sooner for complex diseases that develop over years
● Solution: Neo4j knowledge graph of 3 years of visits, tests, and diagnoses with tens of billions of records, using graph algorithms and machine learning together
● Results:
○ Identified journey archetypes and patterns using graph feature engineering as input to ML
○ Revealed journey similarities over time with community detection
○ Found influential touch-points in the journey using graph algorithms
24. Neo4j, Inc. All rights reserved 2021
Demo
25. Neo4j, Inc. All rights reserved 2021
Graph-Native ML Workflows inside Neo4j
● Graph-Native Feature Engineering: Queries, Algorithms, Embeddings
● Train Predictive Model: 1. Model Type, 2. Property Selection, 3. Train & Test, 4. Model Selection
● Store Model in Database; Publish & Share
● Apply Model to Existing / New Data
● Use Predictions for Decisions
● Use Predictions to Enhance the Graph
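As an illustration of the feature engineering step in this workflow, the sketch below computes three of the heuristic link prediction scores named earlier in the deck (Common Neighbors, Adamic Adar, Preferential Attachment) for every pair of nodes that is not yet connected. It is plain Python on a small hypothetical undirected graph and is not the GDS implementation. In a real workflow, scores like these would be inputs to a trained model rather than the final prediction.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def link_features(adjacency, a, b):
    """Compute heuristic link prediction scores for the node pair (a, b)."""
    common = adjacency[a] & adjacency[b]
    return {
        "common_neighbors": len(common),
        # A shared neighbor with few connections counts for more than a hub.
        "adamic_adar": sum(1.0 / math.log(len(adjacency[z])) for z in common),
        "preferential_attachment": len(adjacency[a]) * len(adjacency[b]),
    }


def candidate_features(adjacency):
    """Score every pair of distinct nodes that is not already connected."""
    return {
        (a, b): link_features(adjacency, a, b)
        for a, b in combinations(sorted(adjacency), 2)
        if b not in adjacency[a]
    }


# Hypothetical graph: a triangle a-b-c with d attached to c.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adjacency = build_adjacency(edges)
features = candidate_features(adjacency)

for pair, scores in features.items():
    print(pair, scores)
```

The pairs and their scores form an ordinary feature table, which is the hand-off point between the graph and a conventional train, test, and model selection loop.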
26. Neo4j, Inc. All rights reserved 2021
Resources
Graph Resources
● Video: Advantages of Graph Technology
● Whitepaper: AI & Graph Technology: Enhancing AI with Context & Connections
● Whitepaper: Financial Fraud Detection with Graph Data Science
● Case Study: Meredith Corporation
Neo4j BookShelf
● Graph Databases For Dummies
● Graph Data Science For Dummies
● O’Reilly Graph Algorithms
27. Neo4j, Inc. All rights reserved 2021
Resources
Get Started
● Sandbox: https://neo4j.com/sandbox/
● Guides: neo4j.com/developer/graph-data-science/
● GitHub: github.com/neo4j/graph-data-science
28. Neo4j, Inc. All rights reserved 2021
Thank you!
Contact us at sales@neo4j.com