I gave this presentation at DataOps 19 in Barcelona.
You will find information about Neo4j and how to use it with Graph Algorithms for Machine Learning and Artificial Intelligence.
3. Neo4j - The Graph Company
Industry’s Largest Dedicated Investment in Graphs
• Creator of the Neo4j Graph Platform
• ~200 employees
• HQ in Silicon Valley; other offices include London, Munich, Paris and Malmö (Sweden)
• $160M in funding from Morgan Stanley, Fidelity and others
• 10M+ downloads
• 250+ enterprise subscription customers, over half of them with >$1B in revenue
Adoption
• 7/10 Top Retail Firms
• 12/25 Top Financial Firms
• 8/10 Top Software Vendors
Ecosystem
• 500+ Startups in program
• 250+ Enterprise customers
• 450+ Partners
• 53K+ Meetup members
• 100+ Events per year
7. What Is A Graph?
• Nodes (vertices)
• Relationships (links, edges)
• Properties
• Labels
8. Neo4j — Changing the World
• Fraud Detection: ICIJ used Neo4j to uncover the world’s largest journalistic leak to date, the Panama Papers, exposing criminals, corruption and extensive tax evasion.
• Knowledge Graph for humans: The US space agency uses Neo4j for its “Lessons Learned” database, connecting information to make search more effective across space missions.
• Knowledge Graph for AI: eBay uses Neo4j to enable machine learning through knowledge graphs powering “conversational commerce”.
9. The world is a graph: everything is connected
• people, places, events
• companies, markets
• countries, history, politics
• sciences, art, teaching
• technology, networks, machines, applications, users
• software, code, dependencies, architecture, deployments
• criminals, fraudsters and their behavior
10. Property Graph Model Components
• Nodes
○ Represent the objects in the graph
○ Can be labeled
Diagram: one Car node and two Person nodes
11. Property Graph Model Components
• Nodes
○ Represent the objects in the graph
○ Can be labeled
• Relationships
○ Relate nodes by type and direction
Diagram: one Car node and two Person nodes, connected by DRIVES, OWNS, LOVES (shown twice) and LIVES WITH relationships
12. Property Graph Model Components
• Nodes
○ Represent the objects in the graph
○ Can be labeled
• Relationships
○ Relate nodes by type and direction
• Properties
○ Name-value pairs that can go on nodes and relationships
Diagram: one Car node and two Person nodes, connected by DRIVES, OWNS, LOVES (shown twice) and LIVES WITH relationships
• Car: brand: “Mini”, model: “Cooper”
• Person: name: “Dan”, born: May 29, 1970, twitter: “@dan”
• Person: name: “Ann”, born: Dec 5, 1975
• Relationship property: since: Jan 10, 2015
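A minimal Python sketch of the three components (nodes with labels, directed and typed relationships, and properties on both) may make the model more concrete. The class and function names are hypothetical. The endpoints of each relationship, and the choice to attach `since` to DRIVES, are assumptions for illustration only, since the transcript does not record them. LIVES WITH is left out for the same reason.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    labels: Set[str]                                   # e.g. {"Person"}
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                                        # directed: start -> end
    type: str                                          # e.g. "LOVES"
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


dan = Node({"Person"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"Car"}, {"brand": "Mini", "model": "Cooper"})

# Assumed endpoints; the slide transcript only lists the relationship types.
relationships: List[Relationship] = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
    Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2015"}),
    Relationship(ann, "OWNS", car),
]


def outgoing(node: Node, rel_type: str) -> List[Node]:
    """Return the end nodes of `node`'s outgoing relationships of one type."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]
```

Calling `outgoing(dan, "LOVES")` walks only the relationships that start at Dan, which is the direction-and-type filtering the slide describes.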
13. Use Cases
● Fraud Detection
○ Anti Money Laundering (AML), e-commerce Fraud, First-Party Bank Fraud, Insurance Fraud, Link Analysis
○ Real-time analysis of data relationships is essential to uncovering fraud rings and other sophisticated scams before fraudsters and criminals cause lasting damage.
○ https://neo4j.com/use-cases/fraud-detection
14. Use Cases
● Fraud Detection
● Master Data Management
○ 360-Degree View of Customer, Cross Reference Business Objects, Data Ownership, Master Data, Organizational Hierarchies
○ Organize and manage your master data with the flexible and schema-free graph database model in order to get real-time insights and a 360° view of your customers.
○ https://neo4j.com/use-cases/master-data-management
15. Use Cases
● Fraud Detection
● Master Data Management
● Recommendation Engine
○ Content & Media Recommendations, Graph-Aided Search Engine, Product Recommendations, Professional Networks, Social Recommendations
○ Graph-powered recommendation engines help companies personalize products, content and services by leveraging a multitude of connections in real time.
○ https://neo4j.com/use-cases/real-time-recommendation-engine
16. Use Cases
● Fraud Detection
● Master Data Management
● Recommendation Engine
● Knowledge Graph
○ Asset Management, Cataloging, Content Management, Inventory, Workflow Processes
○ Tap into the power of graph-based search tools for better digital asset management using the most flexible and scalable solution on the market.
○ https://neo4j.com/use-cases/knowledge-graph
17. Use Cases
● Fraud Detection
● Master Data Management
● Recommendation Engine
● Knowledge Graph
● Network and Database Infrastructure Monitoring
○ Asset Management, Cybersecurity, Impact Analysis, Quality-of-Service Mapping, Root Cause Analysis
○ Graph databases are inherently more suitable than RDBMS for making sense of complex interdependencies central to managing networks and IT infrastructure.
○ https://neo4j.com/use-cases/network-and-it-operations
18. Use Cases
● Fraud Detection
● Master Data Management
● Recommendation Engine
● Knowledge Graph
● Network and Database Infrastructure Monitoring
● Social Media and Social Network Graphs
○ Community Cluster Analysis, Friend-of-Friend Recommendations, Influencer Analysis, Sharing & Collaboration, Social Recommendations
○ Easily leverage social connections or infer relationships based on activity when you use a graph database to power your social network application.
○ https://neo4j.com/use-cases/social-network
19. Use Cases
● Fraud Detection
● Master Data Management
● Recommendation Engine
● Knowledge Graph
● Network and Database Infrastructure Monitoring
● Social Media and Social Network Graphs
● Artificial Intelligence and Machine Learning
○ Artificial Intelligence (AI) is poised to drive the next wave of technological disruption across nearly every industry. Just like previous technology revolutions in web and mobile, however, there will be winners and losers based on who harnesses this technology for a true competitive advantage.
○ https://neo4j.com/use-cases/artificial-intelligence
20. Neo4j Is a Database
• No size limit
• Binary & HTTP protocol
• ACID transactions
• 2-4 M ops/s per core
• Clustering: scale & HA
• Official drivers
Qualities shown around Neo4j: reliability, performance, scalability, availability, integration
22. Native Graph Storage
Index-free adjacency
• At write time: data is connected as it is stored
• At read time: lightning-fast retrieval of data and relationships via pointer chasing
• Graph value is found in the traversals and hops
“We keep the connection lines alive.”
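The idea behind index-free adjacency can be sketched in a few lines of Python: each node keeps direct references to its neighbours, so a traversal follows references instead of consulting a global index. This is a conceptual illustration only, not Neo4j's storage format. `AdjNode` and `within_hops` are hypothetical names.

```python
class AdjNode:
    def __init__(self, name):
        self.name = name
        self.out = []              # direct references to neighbouring AdjNode objects

    def connect(self, other):
        """Write time: store the connection on the node itself."""
        self.out.append(other)


def within_hops(start, hops):
    """Read time: follow stored references outward, up to `hops` steps."""
    seen = {start.name}
    frontier = [start]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbour in node.out:          # pointer chase, no index lookup
                if neighbour.name not in seen:
                    seen.add(neighbour.name)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return seen - {start.name}


a, b, c, d = AdjNode("a"), AdjNode("b"), AdjNode("c"), AdjNode("d")
a.connect(b)
b.connect(c)
c.connect(d)
```

The cost of each hop here depends on the number of neighbours of the nodes visited, not on the total size of the graph.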
24. The Raft Consensus Algorithm
Equivalent to Paxos in fault-tolerance and performance.
Causal Clustering
https://raft.github.io/
25. Schema-free or Schema-based
Optional constraints:
• Node property existence
• Relationship property existence
• Unique property
• Node and combined properties uniqueness
Example graph:
• (Person, Actor) name: Tom Hanks, born: 1956
• (Person, Actor) name: Hugo Weaving, born: 1960
• (Person, Director) name: Lana Wachowski, born: 1965
• (Movie) title: Cloud Atlas, released: 2012
• (Movie) title: The Matrix, released: 1999
• Tom Hanks ACTED_IN Cloud Atlas, roles: [“Zachry”]
• Hugo Weaving ACTED_IN Cloud Atlas, roles: [“Bill Smoke”]
• Hugo Weaving ACTED_IN The Matrix, roles: [“Agent Smith”]
• Lana Wachowski DIRECTED Cloud Atlas and The Matrix
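To show what the existence and uniqueness constraints assert, here is a small Python sketch that checks them over plain dictionaries. It only illustrates the rules. It is not Neo4j's constraint syntax or implementation, and the helper names `check_existence` and `check_unique` are hypothetical.

```python
def check_existence(nodes, label, prop):
    """Return nodes carrying `label` that are missing property `prop`."""
    return [n for n in nodes if label in n["labels"] and prop not in n["props"]]


def check_unique(nodes, label, prop):
    """Return values of `prop` that occur on more than one node carrying `label`."""
    seen, duplicates = set(), set()
    for n in nodes:
        if label in n["labels"] and prop in n["props"]:
            value = n["props"][prop]
            if value in seen:
                duplicates.add(value)
            seen.add(value)
    return duplicates


people = [
    {"labels": {"Person", "Actor"}, "props": {"name": "Tom Hanks", "born": 1956}},
    {"labels": {"Person", "Actor"}, "props": {"name": "Hugo Weaving", "born": 1960}},
    {"labels": {"Person", "Director"}, "props": {"name": "Lana Wachowski", "born": 1965}},
]
```

With the data above both checks come back empty. A schema-based setup would reject a write that made either of them non-empty, while a schema-free setup would simply accept it.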
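26. Cypher Query Language
CREATE (:Person { name:"Dan"} ) -[:LOVES]-> (:Person { name:"Ann"} )
• (:Person { name:"Dan"} ) is a node with a label (Person) and a property (name)
• -[:LOVES]-> is a relationship
• (:Person { name:"Ann"} ) is a node with a label (Person) and a property (name)
Result: Dan LOVES Ann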
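27. Cypher Query Language
MATCH (:Person { name:"Dan"} ) -[:LOVES]-> ( whom )
RETURN whom
• (:Person { name:"Dan"} ) is the known node, -[:LOVES]-> is the relationship, and ( whom ) is the unknown node (?) that the query returns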
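For readers new to pattern matching, the following Python sketch emulates what the MATCH pattern asks for, using a tiny in-memory stand-in for the graph created on the previous slide. It is not how Neo4j executes Cypher. The data layout and the `match_outgoing` helper are assumptions made for illustration.

```python
# In-memory stand-in for the graph: Dan -[:LOVES]-> Ann
nodes = {
    1: {"labels": {"Person"}, "name": "Dan"},
    2: {"labels": {"Person"}, "name": "Ann"},
}
relationships = [(1, "LOVES", 2)]      # (start node id, type, end node id)


def match_outgoing(label, name, rel_type):
    """Emulate MATCH (:label {name: name})-[:rel_type]->(whom) RETURN whom."""
    results = []
    for start, rtype, end in relationships:
        start_node = nodes[start]
        if (label in start_node["labels"]
                and start_node["name"] == name
                and rtype == rel_type):
            results.append(nodes[end])
    return results


whom = match_outgoing("Person", "Dan", "LOVES")
```

Because the pattern is directed, asking the same question with Ann as the starting node returns nothing.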
33. Graph Analytics
• Query (e.g. Cypher/Python): local patterns
○ Real-time, local decisioning and pattern matching
○ You know what you’re looking for and are making a decision
• Graph Algorithms Libraries: global computation
○ Global analysis and iterations
○ You’re learning the overall structure of a network, updating data, and predicting
37. Telecom Network
Figure labels: Bridge Points, Languages
Source: “Fast unfolding of communities in large networks” - Blondel, Guillaume, Lambiotte, Lefebvre - https://arxiv.org/pdf/0803.0476.pdf
38. Graph Algorithms
https://neo4j.com/docs/graph-algorithms
https://neo4j.com/graph-algorithms-book
Path Finding
● Minimum Weight Spanning Tree
● Shortest Path
● Single Source Shortest Path
● All Pairs Shortest Path
● A*
● Yen’s K-shortest paths
● Random Walk
Centrality
● PageRank
● ArticleRank
● Betweenness Centrality
● Closeness Centrality
● Harmonic Centrality
● Eigenvector Centrality
● Degree Centrality
Community Detection
● Louvain
● Label Propagation
● Connected Components
● Strongly Connected Components
● Triangle Counting / Clustering Coefficient
● Balanced Triads
Similarity
● Jaccard Similarity
● Cosine Similarity
● Pearson Similarity
● Euclidean Distance
● Overlap Similarity
Link Prediction
● Adamic Adar
● Common Neighbors
● Preferential Attachment
● Resource Allocation
● Same Community
● Total Neighbors
39. Pathfinding & Search
• Single-Source Shortest Path
○ Calculates the “shortest” path between a node and all other nodes
• All-Pairs Shortest Path
○ Finds all shortest paths between all pairs of nodes
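As a rough illustration of the two ideas, the sketch below uses Dijkstra's algorithm for the single-source case and simply repeats it from every node for the all-pairs case. This is one common textbook approach for non-negative weights, and the library's own implementations may differ. The example graph `roads` and the function names are made up.

```python
import heapq


def single_source_shortest_path(graph, source):
    """Dijkstra's algorithm; `graph` maps node -> {neighbour: non-negative weight}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                              # stale queue entry, already improved
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


def all_pairs_shortest_path(graph):
    """Run the single-source search from every node."""
    return {node: single_source_shortest_path(graph, node) for node in graph}


roads = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
```

From "A", the cheapest way to reach "D" goes through "B" and "C" (1 + 2 + 1) rather than over the direct B to D edge.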
43. Similarity Algorithms
Evaluates how alike nodes are at an individual level, based on their properties or attributes
• Cosine similarity for recommendations (movies): https://neo4j.com/graphgist/movie-recommendations-with-k-nearest-neighbors-and-cosine-similarity
• Social similarities (interests): https://medium.com/neo4j/cosine-similarity-in-neo4j-d617b0442439
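Two of the measures named in this deck, cosine and Jaccard similarity, are short enough to sketch directly in Python. The function names are hypothetical, and returning 0.0 for empty or all-zero input is a convention chosen here rather than something the slides specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0                    # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Cosine similarity suits numeric properties such as ratings, while Jaccard suits set-valued attributes such as lists of interests.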
44. Community Detection Algorithms
Evaluates how a group is clustered or partitioned
Different approaches to define a community
• Label Propagation for predicting drug-drug interactions: https://neo4j.com/blog/graph-algorithms-neo4j-label-propagation
• Twitter polarity classification: https://dl.acm.org/citation.cfm?id=2140465
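As one example of how a community can be defined, here is a simplified Label Propagation sketch in Python. Every node starts with its own label and repeatedly adopts the label most common among its neighbours. The usual algorithm breaks ties at random. This version uses a fixed node order and a deterministic tie-break, both of which are assumptions made to keep the example reproducible, so it should not be read as the library's implementation.

```python
from collections import Counter


def label_propagation(adjacency, max_iterations=20):
    """`adjacency` maps node -> list of neighbours (undirected graph)."""
    labels = {node: node for node in adjacency}     # each node starts in its own community
    for _ in range(max_iterations):
        changed = False
        for node in sorted(adjacency):               # fixed order keeps the run deterministic
            counts = Counter(labels[n] for n in adjacency[node])
            if not counts:
                continue                             # isolated node keeps its label
            best = max(counts.values())
            candidates = [label for label, c in counts.items() if c == best]
            # Tie-break: keep the current label if it is a candidate, else take the largest.
            new_label = labels[node] if labels[node] in candidates else max(candidates)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


# Two triangles (0-1-2 and 3-4-5) joined by a single bridge edge 2-3.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
communities = label_propagation(two_triangles)
```

On this graph the two triangles end up with different labels, because the single bridge edge is outvoted inside each triangle.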
45. Link Prediction
A Goal, an Approach & an Algorithm Category
Can we infer which new interactions are likely to occur in the future?
“We formalize this question as the link prediction problem, and develop approaches to link prediction based on measures for analyzing the ‘proximity’ of nodes in a network.”
Jon Kleinberg and David Liben-Nowell
46. What can we use this approach for?
● future associations in a terrorist network
● co-authorships in a citation network
● associations between molecules in a biology network
● interest in an artist or artwork
47. What's common across all these use cases?
Predicting a link means that we are predicting some future behaviour or an unobserved fact.
For example, in a citation network, we’re actually predicting the action of two people collaborating on a paper.
48. Common Neighbours
Based on the number of potential triangles / closing triangles
The concept is that if two strangers have a friend or colleague in common, they are more likely to be introduced
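The triangle-closing idea can be written as a simple score: count the neighbours two nodes share. The sketch below also includes Adamic Adar, listed earlier in the deck, which weights each shared neighbour by the inverse logarithm of its degree. The `friends` graph and the function names are invented for illustration.

```python
import math


def common_neighbours(adjacency, a, b):
    """Number of neighbours shared by a and b (triangles a new a-b link would close)."""
    return len(adjacency[a] & adjacency[b])


def adamic_adar(adjacency, a, b):
    """Like common neighbours, but shared neighbours with few links count for more."""
    shared = adjacency[a] & adjacency[b]
    # A shared neighbour is linked to both a and b, so its degree is at least 2
    # and log(degree) is positive.
    return sum(1.0 / math.log(len(adjacency[n])) for n in shared)


# Undirected friendship graph stored as node -> set of neighbours.
friends = {
    "Ann": {"Dan", "Eve"},
    "Bob": {"Dan", "Eve"},
    "Dan": {"Ann", "Bob"},
    "Eve": {"Ann", "Bob", "Zoe"},
    "Zoe": {"Eve"},
}
```

Ann and Bob are not yet connected but share two friends, so they score higher than Ann and Zoe, who share only Eve.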
51. Graph Algorithms: Extract Structure and Infer Behavior
Source: “Communities, modules and large-scale structure in networks” - Mark Newman
Source: “Hierarchical structure and the prediction of missing links in networks” - A. Clauset, C. Moore, and M.E.J. Newman
Source: “Structure and inference in annotated networks” - M.E.J. Newman and A. Clauset
52. Centralities
• PageRank
○ Which nodes have the most overall influence
• Closeness
○ Which nodes can reach the entire group the fastest
• Betweenness
○ Which nodes are the bridges between different clusters (they lie on the most shortest paths)
• Degree
○ The number of connections in/out of a node
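Three of these measures can be sketched compactly in Python: degree, closeness (via breadth-first search over hop counts) and PageRank (via plain power iteration). These are textbook formulations under stated assumptions: an unweighted graph, a damping factor of 0.85 and a fixed iteration count. The library's implementations may normalise or converge differently. The `star` graph is a made-up example.

```python
from collections import deque


def degree_centrality(adjacency):
    """Number of connections per node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def closeness_centrality(adjacency):
    """(number of reachable nodes) / (sum of hop distances to them)."""
    scores = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:                                  # breadth-first search from source
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def pagerank(adjacency, damping=0.85, iterations=50):
    """Power iteration; each node shares its rank equally among its out-links."""
    n = len(adjacency)
    rank = {node: 1.0 / n for node in adjacency}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in adjacency}
        for node, neighbours in adjacency.items():
            if neighbours:
                share = damping * rank[node] / len(neighbours)
                for neighbour in neighbours:
                    new_rank[neighbour] += share
            else:                                     # dangling node: spread its rank evenly
                for other in adjacency:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Star graph: hub "H" linked to three leaves, edges listed in both directions.
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
```

On the star graph all three measures agree that the hub is the most central node. On less regular graphs they usually rank nodes differently, which is why the slide lists them separately.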
54. Centralities
Source: Maven 7