This document provides an overview and introduction to graphs for artificial intelligence and machine learning. It discusses definitions of ML and AI and how graphs can be used in both. It describes the graph data model and how graph algorithms like path finding, centrality measures, and clustering can be applied. Contemporary graph ML techniques are summarized, like graph convolutional neural networks and using graphs for structured causal models. The document argues that graphs are a powerful structure for ML that allow smarter data processing and more effective models.
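As a rough illustration of two of the algorithm families named above (path finding and centrality), here is a minimal sketch in plain Python. The adjacency-list graph and the function names `shortest_path` and `degree_centrality` are assumptions made for this example and do not come from the presentation.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search on an unweighted graph.

    graph maps each node to a list of its neighbours.
    Returns the list of nodes on one shortest path, or None if unreachable.
    """
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor chain back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


# Hypothetical undirected toy graph, stored as adjacency lists.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

print(shortest_path(toy_graph, "a", "e"))   # ['a', 'b', 'd', 'e']
print(degree_centrality(toy_graph)["d"])    # 0.75
```

Breadth-first search only finds shortest paths when all edges have equal weight; weighted graphs would need something like Dijkstra's algorithm instead.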
The amount of data available to us is growing rapidly, but what is required to make useful conclusions out of it?
Outline
1. Different tactics to gather your data
2. Cleansing, scrubbing, correcting your data
3. Running analysis for your data
4. Bring your data to life with visualizations
5. Publishing your data for the rest of us as linked open data
The Expectation Maximization (EM) algorithm is a method for finding maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved latent variables. For more information on the EM algorithm, see http://www.transtutors.com/homework-help/statistics/em-algorithm.aspx
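To make the idea concrete, the following is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture, where the latent variable is the component each point belongs to. The choice of model, the initialisation, the function name `em_two_gaussians` and the sample data are all assumptions for illustration; the slides themselves may use a different example.

```python
import math


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (means, standard deviations, mixing weights).
    """
    # Crude initialisation (an assumption of this sketch): means at the extremes.
    mu = [min(data), max(data)]
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]

    def pdf(x, m, s):
        return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

    for _ in range(iterations):
        # E-step: posterior probability (responsibility) of each component per point.
        resp = []
        for x in data:
            p = [weight[k] * pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])

        # M-step: re-estimate parameters from the responsibility-weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has no support; leave it unchanged
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            weight[k] = nk / len(data)

    return mu, sigma, weight


# Hypothetical sample: two well-separated groups around 1.0 and 5.0.
sample = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
means, sigmas, weights = em_two_gaussians(sample)
print(means, weights)
```

Each iteration alternates between estimating the latent assignments given the current parameters (E-step) and maximising the likelihood given those assignments (M-step). EM converges to a local optimum, so results depend on the initialisation.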
These slides are for a tutorial on how to use the R language for data analysis and Machine Learning tasks.
The workshop was given at OSCON (Austin, TX), 2017
Many powerful Machine Learning algorithms are based on graphs, e.g., Page Rank (Pregel), Recommendation Engines (collaborative filtering), text summarization, and other NLP tasks. Also, the recent developments with Graph Neural Networks connect the worlds of Graphs and Machine Learning even further.
Considering data pre-processing and feature engineering, which are both vital tasks in Machine Learning pipelines, extends this relationship across the entire ecosystem. In this session, we will investigate the entire range of Graphs and Machine Learning with many practical exercises.
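As one example of a graph-based algorithm mentioned in this abstract, PageRank can be sketched as a simple power iteration. This is a single-machine illustration in plain Python, not the Pregel-style distributed formulation; the function name `pagerank`, the damping factor of 0.85 and the toy link graph are assumptions for this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each node to the list of nodes it links to.
    Returns a dict of node -> rank; the ranks sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives the same small "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "d" links to "a" but nothing links to "d".
toy_links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
for node, score in sorted(pagerank(toy_links).items()):
    print(node, round(score, 4))
```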
How To Interview a Data Scientist
Daniel Tunkelang
Presented at the O'Reilly Strata 2013 Conference
Video: https://www.youtube.com/watch?v=gUTuESHKbXI
Interviewing data scientists is hard. The tech press sporadically publishes “best” interview questions that are cringe-worthy.
At LinkedIn, we put a heavy emphasis on the ability to think through the problems we work on. For example, if someone claims expertise in machine learning, we ask them to apply it to one of our recommendation problems. And, when we test coding and algorithmic problem solving, we do it with real problems that we’ve faced in the course of our day jobs. In general, we try as hard as possible to make the interview process representative of actual work.
In this session, I’ll offer general principles and concrete examples of how to interview data scientists. I’ll also touch on the challenges of sourcing and closing top candidates.
This tutorial will provide you with a basic understanding of graph database technology and the ability to quickly begin development of a graph database application. You will have the capability to recognize graph-based problems and present the benefits of using graph technology for problem resolution.
The tutorial will give you an understanding of:
• Graph theory - origins and concepts
• Benefits of graph databases
• Different types of graph databases
• Typical graph database API
• Programming basics
• Use cases
Bring your laptops for a hands-on opportunity to practice with some sample code. A basic understanding of Java programming is a recommended prerequisite for this course. This session is led by the InfiniteGraph technical team, and the demonstration code will be drawn from InfiniteGraph examples; however, the broader educational presentation is product-neutral and not a commercial presentation of their products.
To participate in the hands-on portion of the graph tutorial users must have:
• Java programming experience
• Java Developer Kit (JDK)
• Current InfiniteGraph installed on laptop. (To download visit www.objectivity.com/infinitegraph)
• HelloGraph test – Upon installing IG, run HelloGraph to test the install. (HelloGraph can be found online at http://wiki.infinitegraph.com/2.1/w/index.php?title=Download_Sample_Code)
Leon Guzenda was one of the founding members of Objectivity in 1988 and one of the original architects of Objectivity/DB. He currently works with Objectivity's major customers to help them effectively develop and deploy complex applications and systems that use the industry's highest-performing, most reliable DBMS technology, Objectivity/DB. He also liaises with technology partners and industry groups to help ensure that Objectivity/DB remains at the forefront of database and distributed computing technology. Leon has more than 35 years of experience in the software industry. At Automation Technology Products, he managed the development of the ODBMS for the Cimplex solid modeling and numerical control system. Before that, he was Principal Project Director for International Computers Ltd. in the United Kingdom, delivering major projects for NATO and leading multinationals. He was also design and development manager for ICL's 2900 IDMS product. He spent the first 7 years of his career working in defense and government systems. Leon has a B.S. degree in Electronic Engineering from the University of Wales.
Presentation on data preparation with pandas
AkshitaKanther
Data preparation is the first step after you get your hands on any kind of dataset. This is the step when you pre-process raw data into a form that can be easily and accurately analyzed. Proper data preparation allows for efficient analysis - it can eliminate errors and inaccuracies that could have occurred during the data gathering process and can thus help in removing some bias resulting from poor data quality. Therefore a lot of an analyst's time is spent on this vital step.
This developer-focused webinar will explain how to use the Cypher graph query language. Cypher, a query language designed specifically for graphs, allows for expressing complex graph patterns using simple ASCII art-like notation and offers a simple but expressive approach for working with graph data.
During this webinar you'll learn:
-Basic Cypher syntax
-How to construct graph patterns using Cypher
-Querying existing data
-Data import with Cypher
-Using aggregations such as statistical functions
-Extending the power of Cypher using procedures and functions
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...
Neo4j
In this talk we'll explore powerful analytic techniques for graph data. Firstly we'll discover some of the innate properties of (social) graphs from fields like anthropology and sociology. By understanding the forces and tensions within the graph structure and applying some graph theory, we'll be able to predict how the graph will evolve over time. To test just how powerful and accurate graph theory is, we'll also be able to (retrospectively) predict World War 1 based on a social graph and a few simple mechanical rules.
Then we'll see how graph matching can be used to extract online business intelligence (for powerful retail recommendations). In turn we'll apply these powerful techniques to modelling domains in Neo4j (a graph database) and show how Neo4j can be used to drive business intelligence.
Don't worry, there won't be much maths :-)
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
Daniel Zivkovic
Serverless Toronto's 6th-anniversary event helps IT pros understand and prepare for the #GenAI tsunami ahead. You'll gain situational awareness of the LLM Landscape, receive condensed insights, and actionable advice about RAG in 2024 from Google AI Lead Mark Ryan and LlamaIndex creator Jerry Liu. We chose #RAG (Retrieval-Augmented Generation) because it is the predominant paradigm for building #LLM (Large Language Model) applications in enterprises today - and that's where the jobs will be shifting. Here is the recording: https://youtu.be/P5xd1ZjD-Os?si=iq8xibj5pJsJ62oW
During this Big Data Warehousing Meetup, Caserta Concepts and Databricks addressed the number one operational and analytic goal of nearly every organization today: to have a complete view of every customer. Customer Data Integration (CDI) must be implemented to cleanse and match customer identities within and across various data systems. CDI has been a long-standing data engineering challenge, not just one of logic and complexity but also of performance and scalability.
The speakers brought together best practice techniques with Apache Spark to achieve complete CDI.
Speakers:
Joe Caserta, President, Caserta Concepts
Kevin Rasmussen, Big Data Engineer, Caserta Concepts
Vida Ha, Lead Solutions Engineer, Databricks
The sessions covered a series of problems that are adequately solved with Apache Spark, as well as those that require additional technologies to implement correctly. Topics included:
· Building an end-to-end CDI pipeline in Apache Spark
· What works, what doesn’t, and how we use Spark as we evolve
· Innovation with Spark including methods for customer matching from statistical patterns, geolocation, and behavior
· Using PySpark and Python’s rich module ecosystem for data cleansing and standardization matching
· Using GraphX for matching and scalable clustering
· Analyzing large data files with Spark
· Using Spark for ETL on large datasets
· Applying Machine Learning & Data Science to large datasets
· Connecting BI/Visualization tools to Apache Spark to analyze large datasets internally
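The graph-based matching and clustering topic in the list above can be pictured as a connected-components problem: records that share an identifier are linked, and each connected group is treated as one customer. The sketch below is plain single-machine Python using union-find, not GraphX or Spark; the record layout, the identifiers and the function name `match_customers` are hypothetical and only illustrate the idea.

```python
def match_customers(records):
    """Group records that share at least one identifier, directly or transitively.

    records maps a record id to a set of identifier strings (e.g. email, phone).
    Returns a list of sets of record ids, one set per matched customer.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every record to the first record seen with the same identifier.
    first_seen = {}
    for rid, identifiers in records.items():
        for ident in identifiers:
            if ident in first_seen:
                union(first_seen[ident], rid)
            else:
                first_seen[ident] = rid

    clusters = {}
    for rid in records:
        clusters.setdefault(find(rid), set()).add(rid)
    return list(clusters.values())


# Hypothetical records from different source systems.
toy_records = {
    "crm-1": {"ann@example.com", "555-0100"},
    "web-7": {"555-0100"},
    "pos-3": {"bob@example.com"},
    "app-9": {"ann@example.com"},
}
print(match_customers(toy_records))
```

Exact identifier equality is the simplest possible matching rule; real customer matching usually also needs standardization and fuzzy comparison before records are linked.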
The speakers also touched on data governance, on-boarding new data rapidly, and how to balance rapid agility and time to market with critical decision support and customer interaction. They also shared examples of problems that Apache Spark is not optimized for.
For more information on the services offered by Caserta Concepts, visit our website: http://casertaconcepts.com/
Relationships Matter: Using Connected Data for Better Machine Learning
Neo4j
Relationships are highly predictive of behavior, yet most data science models overlook this information because it's difficult to extract network structure for use in machine learning (ML).
With graphs, relationships are embedded in the data itself, making it practical to add these predictive capabilities to your existing practices.
That’s why we’re presenting and demoing the use of graph-native ML to make breakthrough predictions. This will cover:
- Different approaches to graph feature engineering, from queries and algorithms to embeddings
- How ML techniques leverage everything from classical network science to deep learning and graph convolutional neural networks
- How to generate representations of your graph using graph embeddings, create ML models for link prediction or node classification, and apply these models to add missing information to an existing graph/incoming data
- Why no-code visualization and prototyping is important
How Graph Databases used in Police Department?
Samet KILICTAS
This presentation introduces the basics of graph concepts and graph databases. It explains how graph databases are used, with sample use cases from industry, and how they can be used by police departments. Questions like "When to use a graph DB?" and "Should I solve a problem with Graph DB?" are answered.
Improve ml predictions using graph algorithms (webinar july 23_19).pptx
Neo4j
Graph enhancements to AI and ML are changing the landscape of intelligent applications. In this webinar, we’ll focus on using graph feature engineering to improve the accuracy, precision, and recall of machine learning models. You’ll learn how graph algorithms can provide more predictive features as well as aid in feature selection to reduce overfitting. We’ll illustrate a link prediction workflow using Spark and Neo4j to predict collaboration and discuss our missteps and tips to get to measurable improvements.
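To give a feel for what graph features for link prediction can look like, here is a small sketch of two classic pairwise scores, common neighbours and Adamic-Adar, in plain Python. It does not use Spark or Neo4j and is not the webinar's workflow; the co-authorship graph, the author names and the function names are assumptions for illustration.

```python
import math


def common_neighbors(graph, u, v):
    """Number of neighbours that u and v share."""
    return len(graph[u] & graph[v])


def adamic_adar(graph, u, v):
    """Shared neighbours weighted by 1 / log(degree).

    A shared neighbour with few connections counts for more than a hub.
    """
    return sum(
        1.0 / math.log(len(graph[w]))
        for w in graph[u] & graph[v]
        if len(graph[w]) > 1  # log(1) == 0 would divide by zero
    )


# Hypothetical undirected co-authorship graph: author -> set of co-authors.
coauthors = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat"},
}

# Features for the candidate pair (ann, dan), who have not yet collaborated.
features = {
    "common_neighbors": common_neighbors(coauthors, "ann", "dan"),
    "adamic_adar": adamic_adar(coauthors, "ann", "dan"),
}
print(features)
```

Scores like these would typically be computed for many candidate pairs and passed as input columns to a binary classifier that predicts whether a link will form.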
DutchMLSchool. Logistic Regression, Deepnets, Time Series
BigML, Inc
DutchMLSchool. Logistic Regression, Deepnets, and Time Series (Supervised Learning II) - Main Conference: Introduction to Machine Learning.
DutchMLSchool: 1st edition of the Machine Learning Summer School in The Netherlands.
Presentation of the Semantic Knowledge Graph research paper at the 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (Montreal, Canada - October 18th, 2016)
Abstract—This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
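The core idea in the abstract, terms as nodes and the intersection of their postings lists as a dynamically materialized edge, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokenization is a naive whitespace split, the complementary uninverted index is omitted, and the Jaccard overlap used for scoring is a stand-in chosen for this example rather than the scoring method of the paper. All function names and documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids containing it (its postings list)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def edge(index, term_a, term_b):
    """Materialize the edge between two term nodes: the documents they share."""
    return index.get(term_a, set()) & index.get(term_b, set())


def jaccard_score(index, term_a, term_b):
    """Stand-in relatedness score: shared documents over all documents with either term."""
    union = index.get(term_a, set()) | index.get(term_b, set())
    return len(edge(index, term_a, term_b)) / len(union) if union else 0.0


# Hypothetical mini-corpus.
corpus = {
    1: "java developer spring",
    2: "java developer hibernate",
    3: "nurse hospital care",
}
index = build_inverted_index(corpus)
print(edge(index, "java", "developer"))           # {1, 2}
print(jaccard_score(index, "java", "developer"))  # 1.0
print(edge(index, "java", "nurse"))               # set()
```

No edges are stored anywhere: each one is computed on demand from the postings lists, which is what lets any combination of terms be related to any other.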
In this webinar, we give an overview of our offering for data scientists and show what is already possible today, relatively easily and quickly.
Workshop - Graph Application Architecture - GraphSummit Paris
Neo4j
Workshop - Graph Application Architecture
Join this hands-on workshop led by Neo4j experts who will guide you in discovering contextual intelligence. Using a real dataset, we will build a graph solution step by step, from building the graph data model to running queries and visualizing the data. The approach will be applicable to multiple use cases and industries.
Workshop - Innovating with Generative AI and Knowledge Graphs
Neo4j
Workshop - Innovating with Generative AI and Knowledge Graphs
Look beyond the AI hype and discover practical techniques for using AI responsibly across your organization’s data. Explore how to use knowledge graphs to increase accuracy, transparency, and explainability in generative AI systems. You will leave with hands-on experience combining data relationships and LLMs to bring domain-specific context and improve your reasoning.
Bring your laptop and we will guide you through setting up your own generative AI stack, providing practical, coded examples to get started in minutes.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Discover the latest Neo4j innovations, including the latest cloud integrations and product enhancements that make Neo4j an essential choice for developers building applications with interconnected data and generative AI.
SOPRA STERIA - GraphRAG: pushing past the limitations of RAG through the use of ...
Neo4j
Romain CAMPOURCY – Solution Architect, Sopra Steria
Patrick MEYER – Group AI Architect, Sopra Steria
Retrieval-Augmented Generation (RAG) makes it possible to answer user questions about a business domain using large language models. This technique works well when the documentation is simple but runs into limitations as soon as the sources are complex. Drawing on a project we carried out, we will present GraphRAG, a new approach that uses a generated Neo4j database to improve document understanding and information synthesis. This method outperforms the RAG approach by providing more holistic and accurate answers.
ADEO - Knowledge Graph for e-commerce, between challenges and opportunities ...
Neo4j
Charles Gouwy, Business Product Leader, Adeo Services (Groupe Leroy Merlin)
While their Knowledge Graph has already been integrated across all the shopping experiences of their e-commerce platform for more than 3 years, we will see what new opportunities and challenges are still opening up for them thanks to their use of a graph database and the emergence of AI.
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
GraphAware - Transforming policing with graph-based intelligence analysis
Neo4j
Petr Matuska, Sales & Sales Engineering Lead, GraphAware
Western Australia Police Force’s adoption of Neo4j and the GraphAware Hume graph analytics platform marks a significant advancement in data-driven policing. Facing the challenges of growing volumes of valuable data scattered in disconnected silos, the organisation successfully implemented Neo4j database and Hume, consolidating data from various sources into a dynamic knowledge graph. The result was a connected view of intelligence, making it easier for analysts to solve crime faster. The partnership between Neo4j and GraphAware in this project demonstrates the transformative impact of graph technology on law enforcement’s ability to leverage growing volumes of valuable data to prevent crime and protect communities.
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
Neo4j
David Pond, Lead Product Manager, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Shirley Bacso, Data Architect, Ingka Digital
“Linked Metadata by Design” represents the integration of the outcomes from human collaboration, starting from the design phase of data product development. This knowledge is captured in the Data Knowledge Graph. It not only enables data products to be robust and compliant but also well-understood and effectively utilized.
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Neo4j
Delivered by Michael Down at Gartner Data & Analytics Summit London 2024 - Your enemies use GenAI too: Staying ahead of fraud with Neo4j.
Fraudsters exploit the latest technologies like generative AI to stay undetected. Static applications can’t adapt quickly enough. Learn why you should build flexible fraud detection apps on Neo4j’s native graph database combined with advanced data science algorithms. Uncover complex fraud patterns in real-time and shut down schemes before they cause damage.
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptx
Neo4j
Delivered by Sreenath Gopalakrishna, Director of Software Engineering at BT, and Dr Jim Webber, Chief Scientist at Neo4j, at Gartner Data & Analytics Summit London 2024, this presentation examines how knowledge graphs and GenAI combine in real-world solutions.
BT Group has used the Neo4j Graph Database to enable impressive digital transformation programs over the last 6 years. By re-imagining their operational support systems to adopt self-serve and data-led principles, they have substantially reduced the number of applications and the complexity of their operations. The result has been a substantial reduction in risk and costs while improving time to value, innovation, and process automation. Future innovation plans include the exploration of uses of EKG + Generative AI.
Workshop: Enabling GenAI Breakthroughs with Knowledge Graphs - GraphSummit Milan
Neo4j
Look beyond the hype and unlock practical techniques to responsibly activate intelligence across your organization’s data with GenAI. Explore how to use knowledge graphs to increase accuracy, transparency, and explainability within generative AI systems. You’ll depart with hands-on experience combining relationships and LLMs for increased domain-specific context and enhanced reasoning.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Understanding Globus Data Transfers with NetSage
Globus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Anthony Dahanne
Les Buildpacks existent depuis plus de 10 ans ! D’abord, ils étaient utilisés pour détecter et construire une application avant de la déployer sur certains PaaS. Ensuite, nous avons pu créer des images Docker (OCI) avec leur dernière génération, les Cloud Native Buildpacks (CNCF en incubation). Sont-ils une bonne alternative au Dockerfile ? Que sont les buildpacks Paketo ? Quelles communautés les soutiennent et comment ?
Venez le découvrir lors de cette session ignite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteGoogle
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Graphs for AI and ML
1. Graphs for AI and ML
Dr. Jim Webber
Chief Scientist, Neo4j
@jimwebber
2. ● Some definitions
● Accidental Skynet
● Graph theory
● Contemporary graph ML
● The future of graph AI
Overview
3. ● ML - Machine Learning
○ Finding functions from historical data to guide future
interactions within a given domain
● AI - Artificial Intelligence
● The property of a system that it appears intelligent to its users
● Often, but not always, using ML techniques
● Or ML implementations that can be cheaply retrained to address
neighbouring domains
A Bluffer’s Guide to AI-cronyms
4. ● Predictive analytics
● Use past data to predict the future
● General purpose AI
● ML with transfer learning such that learned experiences in one
domain can be applied elsewhere
● Human-like AI
Often conflated with
7. Extract all the features!
• What do we do? Turn it to
vectors and pump it through a
classification or regression
model
• That’s actually not a bad
thing
• But we can do so much before
we even get to ML…
• … if we have graph data
10. • Nodes with optional properties and optional labels
• Named, directed relationships with optional properties
• Relationships have exactly one start and end node
• Which may be the same node
Labeled Property graph model
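The four rules on this slide can be sketched as two small Python data classes. This is only an illustration of the model, not how Neo4j stores data; the class names, the `WROTE_ABOUT` and `KNOWS` relationship names and the sample values are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with optional labels and optional properties."""
    node_id: int
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship with exactly one start and one end node."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data for illustration only.
jim = Node(1, {"author"}, {"name": "Jim Webber"})
doctor = Node(2, {"character"}, {"name": "The Doctor"})

# A relationship always has one start node and one end node...
wrote_about = Relationship("WROTE_ABOUT", jim, doctor)
# ...and both ends may be the same node (a self-loop).
self_ref = Relationship("KNOWS", jim, jim)
```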
11. Fearless querying
MATCH path = (:author {name:'Jim Webber'})
-[*]->(:character {name:'The Doctor'})
RETURN path
OR
MATCH (me:author {name:'Jim Webber'}),
(doc:character {name:'The Doctor'}),
path = shortestPath((me)-[*]->(doc))
RETURN path
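To give a feel for what a shortest-path query has to do, here is a minimal breadth-first search in plain Python. It is a sketch under simple assumptions (an unweighted, directed adjacency dict), not a description of how Cypher's shortestPath is implemented, and the toy graph is invented for the example.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search over directed edges; returns a shortest path or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor chain back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical toy data, not taken from the talk.
adjacency = {
    "Jim Webber": ["Neo4j", "Doctor Who fandom"],
    "Neo4j": ["Graph databases"],
    "Doctor Who fandom": ["The Doctor"],
    "Graph databases": ["The Doctor"],
}
path = shortest_path(adjacency, "Jim Webber", "The Doctor")
```

Because the search expands one hop at a time, the first time it reaches the goal it has found a path with the fewest relationships.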
18. Toolkit matures into
proper database
• Cypher and Neo4j server make
real time graph analytical
patterns simple to apply
• Amazing and humane to
implement
37. Graph Theory
• Rich knowledge of how graphs
operate in many domains
• Off the shelf algorithms to
process those graphs for
information, insight, predictions
• Low barrier to entry
• Amazingly powerful
56. If a node has strong relationships to two neighbours, then these
neighbours must have at least a weak relationship between them.
[Wikipedia]
Strong Triadic Closure
59. • Relationships can have “strength” as well as intent
• Think: weighting on a relationship in a property graph
• Weak links play another super-important structural role in graph
theory
• They bridge neighbourhoods
Weak relationships
61. “If a node A in a network satisfies the Strong Triadic Closure Property
and is involved in at least two strong relationships, then any local
bridge it is involved in must be a weak relationship.”
[Easley and Kleinberg]
Local Bridge Property
63. • (NP) Hard problem
• Repeatedly remove the spanning links between dense regions
• Or recursively merge nodes into ever larger “subgraph” nodes
• Choose your algorithm carefully – some are better than others for
a given domain
• Can use to (almost exactly) predict the
break up of the karate club!
Graph Partitioning
70. Find and stop spammers
Extract graph structure over time
Not message content!
(Fakhraei et al, KDD 2015)
Learning to stop bad guys
Result: find and classify 70% spammers with 90% accuracy
71. Much of modern graph ML is still about turning graphs to vectors
Graph2Vec and friends
Highly complementary techniques
Mixing structural data and features gives better results
Better data into the model, better results out
But we don’t have to always vectorize graphs...
Graph ML
72. Knowledge Graphs
• Semantic domain knowledge for
inference and understanding
• E.g. eBay Google Assistant
• What’s the next best question to ask
when a potential customer says they
want a bag?
• Price? Function? Colour?
• Depends on context! Demographic,
history, user journey.
• Richly connected data makes the
system seem intelligent
• But it’s “just” data and algorithms in
reality
73. Graph Convolutional
Neural Networks
A general architecture for
predicting node and relationship
attributes in graphs.
(Kipf and Welling, ICLR 2017)
Credit: Andrew Docherty (CSIRO), YowData 2017
https://www.youtube.com/watch?v=Gmxz41L70Fg
74. Graph Networks for
Structured Causal Models
• Position paper from Google,
MIT, Edinburgh
• Structured representations and
computations (graphs) are key
• Goal: generalize beyond direct
experience
• Like human infants can
https://arxiv.org/pdf/1806.01261.pdf
ML - this is what nerds do. Sometimes ML is so compelling that it seems intelligent, but in reality it’s data and algorithms
AI - train a system to classify animals, might also work on shoes. See: hot dog; not hot dog!
GP-AI - systems like AlphaGo might be an architecture to support this in future, but we’re not there today
Here’s where we are mostly today. Row-oriented data.
Maybe some documents, maybe some columns, but mostly rows of data from arcane data models.
You already know graphs
People talk about Codd’s relational model being mature because it was proposed in 1969 – 49 years old.
Euler’s graph theory was proposed in 1736 – 282 years old.
Now we use the labelled property graph model. A very simple set of idioms that can build very sophisticated models.
Graphs are the most natural way to model most domains. You already know this because you draw graphs on a whiteboard, but you’ve never had the opportunity to take that down into the database before.
Nodes are a bit like documents, but they’re flat at present in Neo4j.
You pour data into your nodes and then connect them – easy peasy.
This enables high fidelity domain modeling because this is how your domains work.
And you don’t have to do this stuff in your application code – it’s right there in the database
Let’s prove it by exploring a fun domain…
If you want to know who followed Matt Smith, easy!
Traversing the regenerated (or any) relationship takes about 1/40 millionth of a second on this mac in a steady state database
What if you want to know who preceded Matt Smith?
Easy. Traverse the regenerated relationships in the other direction.
Cost? About 1/40 millionth of a second on this laptop in a steady state database.
Joins are super cheap for good graph DBs
On my laptop, I can get to 40M traversals/sec in a steady state DB
You can explore a lot of data very quickly
Which makes it a good fit for data intensive applications like ML
My shortest path to Doctor Who?
But before we get to ML, let’s take a step back into my history building smart systems
All the way back to Autumn 2008
November 2007 met Emil at Øredev in Malmö Sweden
Java and Maven build-your-own-DBMS toolkit called Neo4j
Java Core API only
Long afternoon of loading data and writing a recommendation query...
Find the current customer
Find things they own
Find things that depend on the things they own
Sell
Repeat
All we did at first was understand the dependencies between products and bundles.
We never tried to upsell something incompatible. Never tried to sell them something they already owned. Never undersold them.
And it opened a world of possibilities to combine other graphs: demographic, social, geographical, municipal, network...
The system made intelligent suggestions, but it was not ML or AI, just graph queries. It was good.
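The recommendation loop above (find the customer, find what they own, find what depends on it) can be sketched in a few lines of Python. This is a guess at the shape of the logic, not the original query: the data, the product names and the rule that "compatible" means every dependency is already owned are assumptions for illustration.

```python
def recommend(owns, depends_on, customer):
    """Suggest products whose dependencies the customer already owns."""
    owned = owns.get(customer, set())
    suggestions = set()
    for product, requirements in depends_on.items():
        if product in owned:
            continue  # never try to sell something they already own
        # Assumed rule: only offer products whose dependencies are all owned,
        # so nothing incompatible is ever suggested.
        if requirements and requirements <= owned:
            suggestions.add(product)
    return suggestions


# Hypothetical catalogue and customer.
owns = {"alice": {"broadband"}}
depends_on = {
    "broadband": set(),
    "tv_bundle": {"broadband"},
    "sports_pack": {"broadband", "tv_bundle"},
}
offers = recommend(owns, depends_on, "alice")
```

Once the customer buys the suggested product, running the same function again surfaces the next layer of dependent products, which is the "Sell, Repeat" part of the loop.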
Unexpectedly Powerful
Solved in a long afternoon a problem that was meant to take years with off-the-shelf software
Applied same pattern to PoS retail recommendations, fraud detection… in subsequent months
Still amazed!
Effect: joined Neo4j as Chief Scientist in 2010.
So let’s get into graphs.
Realtime retail recommendations.
Historical anecdote about beer and nappies.
Large UK retailer
We had a data model
Some of it taxonomical
Some of it stock-centric.
Some transactional
The insight here is that we can find a typical young father, who buys beer, nappies and a game console, simply by reducing the subgraph
We have a pattern to search for
We knew it was young fathers, but I bet your model would classify them as lazy, drunken, gamers right?
Now we look for young fathers – implied by beer and nappies purchases – who haven’t bought a game console.
Turn it to text. And…
Neo4j 2.0:
MATCH (u:User), (n:ProductType), (b:ProductType), (x:ProductType)
WHERE
n.name = "nappies" AND
b.name = "Beer" AND
x.name = "Xbox" AND
(u)-[:BOUGHT]->()<-[:MEMBER_OF]-(n) AND
(u)-[:BOUGHT]->()<-[:MEMBER_OF]-(b) AND
NOT((u)-[:BOUGHT]->()<-[:MEMBER_OF]-(x))
RETURN u
This is fast: query latency is proportional to the amount of graph searched
Now called “network science”
First we need to talk about some local properties
A triadic closure is a local property of (social) graphs whereby if two nodes are connected via a path involving a third node, there is an increased likelihood that the two nodes will become directly connected in future.
This is a familiar enough situation for us in a social setting whereby if we happen to be friends with two people, ultimately there's an increased chance that those people will become direct friends too, since by being our friend in the first place, it's an indication of social similarity and suitability.
It’s called triadic closure, because we try to close the triangle.
We see this all the time – it’s likely that if we have two friends, that they will also become at least acquaintances and potentially friends themselves!
In general, if a node A has relationships to B & C then the relationship between B&C is likely to form – especially if the existing relationships are both strong.
This is an incredibly strong assertion and will not be typically upheld by all subgraphs in a graph. Nonetheless it is sufficiently commonplace (particularly in social networks) to be trusted as a predictive aid.
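Because the property will not hold everywhere, it can be useful to list the places where it fails: those are the relationships most likely to form next. The sketch below assumes relationships are already split into "strong" and "weak" edge lists; the function name and the sample data are invented for the example.

```python
from itertools import combinations


def stc_violations(strong, weak):
    """Return (node, a, b) where node has strong ties to a and b,
    but a and b have no tie at all (neither strong nor weak)."""
    def neighbours(edges):
        table = {}
        for a, b in edges:
            table.setdefault(a, set()).add(b)
            table.setdefault(b, set()).add(a)
        return table

    strong_nbrs = neighbours(strong)
    any_nbrs = neighbours(list(strong) + list(weak))
    violations = []
    for node, friends in strong_nbrs.items():
        for a, b in combinations(sorted(friends), 2):
            if b not in any_nbrs.get(a, set()):
                violations.append((node, a, b))
    return violations


# Hypothetical data: A is strongly tied to B and C, who do not know each other.
open_triangle = stc_violations([("A", "B"), ("A", "C")], [])
closed_triangle = stc_violations([("A", "B"), ("A", "C")], [("B", "C")])
```

Each reported triple is a candidate prediction: B and C are likely to form at least a weak relationship.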
Sentiment plays a role in how closures form too – there is a notion of balance.
From a triadic closure perspective this is OK, but intuitively it seems odd.
Cartman’s friends shouldn’t be friends with his enemies. Nor should Cartman’s enemies be friends with his friends.
This makes sense – Cartman’s friend Craig is also an enemy of Cartman’s enemy Tweek
Two negative sentiments and one positive sentiment is a balanced structure – and it makes sense too since we gang up with our friends on our poor beleaguered enemy
Is this true?
Yes.
Is it nice?
No.
Is it realistic?
Oh yes.
Another balanced – and more pleasant – arrangement is for three positive sentiments, in this case mutual friends.
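The balance rule can be checked mechanically: score a friendship as +1 and an enmity as -1, and a triangle is balanced when the product of its three signs is positive. That covers the two balanced cases above (+++ and --+). The sketch below is a minimal illustration; the data structure and function name are assumptions.

```python
from itertools import combinations


def unbalanced_triangles(signs):
    """signs maps frozenset({a, b}) to +1 (friend) or -1 (enemy).
    Returns the triangles whose sign product is negative."""
    nodes = sorted({n for pair in signs for n in pair})
    result = []
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in signs for e in edges):
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            if product < 0:
                result.append((a, b, c))
    return result


# Cartman and Craig are friends; both are enemies of Tweek (balanced, --+).
balanced = {
    frozenset(("Cartman", "Craig")): +1,
    frozenset(("Cartman", "Tweek")): -1,
    frozenset(("Craig", "Tweek")): -1,
}
# Hypothetical variation: Craig befriends Tweek, giving ++- (unbalanced).
unbalanced = dict(balanced)
unbalanced[frozenset(("Craig", "Tweek"))] = +1
```

Under this rule a triangle of three mutual enemies also counts as unbalanced, since its sign product is negative.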
A starting point for a network of friends and enemies 100 years on from the armistice
Red links indicate enemy of relationship
Black links indicate friend of relationship
The Three Emperors’ League
Italy forms the Triple Alliance with Austria and Germany – a balanced +++ triadic closure
If Italy had made only a single alliance (or enemy) it would have been unstable and another relationship would be likely to form anyway!
Triple Alliance
Russia becomes hostile to Austria and Germany – a balanced --+ triadic closure
becomes agnostic towards France.
German-Russian Lapse
The French and Russians ally, forming a balanced --+ triadic closure with the UK
French-Russian Alliance
The UK and France enter into the famous
Entente Cordiale
This produces an unbalanced ++- triadic closure with Russia, and the graph doesn’t like it.
The British and Russians form an alliance, thereby changing their previously unbalanced triadic closure into a balanced one.
Other local pressures on the graph make other closures form.
Italy becomes hostile to Russia, forming a balanced --+ closure with France, and another balanced --+ closure with the UK.
Germany and the UK become hostile forming a balanced --+ closure with Austria and another balanced --+ closure with Italy
British-Russian Alliance
That WWI can be predicted without domain knowledge by iterating a graph and applying local structural constraints is nothing short of astonishing to me.
Note how the network slides into a balanced labeling — and into World War I.
A very surprising result: graphs don’t know about human conflicts.
In this case the strong triadic closure property still holds – though it is a weak link that characterises the relationship between Stan and Cartman.
Given a starting graph, we can apply this simple local principle to see how it would evolve.
A local bridge acts as a link – perhaps the only realistic link - between two otherwise distant (or separate) subgraphs.
Local bridges are semantically rich – they provide conduits for information flow between otherwise independent groups.
In this case DATING is a local bridge – it must also be a weak relationship according to our definition of a local bridge
Intuitively this makes sense – your girl/boyfriend is rather less important at age 8 than your regular friends, IIRC.
How do we identify local bridges?
Any weak link whose removal would cause a component of the graph to become disconnected.
Being able to identify local bridges is important – in this case it’s the only known conduit to allow the girls and boys to communicate.
In real life, local bridges are apparent in your organisation as experts (or managers), and they appear as a nexus in fraud cases.
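Following the rule of thumb in these notes (a link whose removal disconnects part of the graph), bridges can be found by deleting each relationship in turn and checking whether its two ends can still reach each other. This is a deliberately simple sketch, not an efficient algorithm, and the names in the sample graph are made up. Easley and Kleinberg's local bridge is a looser notion: the two endpoints merely have no neighbours in common.

```python
from collections import deque


def connected(adjacency, source, target, skip_edge):
    """BFS from source to target, ignoring one undirected edge."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency[node]:
            if {node, nxt} == set(skip_edge) or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return False


def find_bridges(edges):
    """Return the edges whose removal disconnects their own endpoints."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return [e for e in edges if not connected(adjacency, e[0], e[1], e)]


# Hypothetical playground graph: two tight groups joined by one DATING link.
edges = [
    ("Stan", "Kyle"), ("Kyle", "Cartman"), ("Stan", "Cartman"),
    ("Wendy", "Bebe"), ("Bebe", "Red"), ("Wendy", "Red"),
    ("Stan", "Wendy"),  # the DATING relationship
]
bridges = find_bridges(edges)
```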
Zachary in the Journal of Anthropological Research 1977
Intuitively we can see “clumps” in this graph.
But how do we separate them out? It’s called minimum cut.
What’s interesting is that it’s mechanical – no domain knowledge is necessary.
There’s only one failure with the method Zachary chose to partition the graph: the cut placed node 9 with the original president of the club (node 34), but he actually went to the instructor’s club.
Why? Because the student was three weeks away from completing a four-year quest to obtain a black belt, which he could only do with the instructor (node 1)
Other minimum cut approaches might deliver slightly different results, but on the whole it’s amazing you get such insight from an algorithm!
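One mechanical way to "repeatedly remove the spanning links between dense regions" is to delete the relationship with the highest edge betweenness until the graph falls apart, in the spirit of the Girvan-Newman method. The sketch below is an assumption about how that could look in plain Python; it is not the min-cut method Zachary used, and the six-node sample graph is invented.

```python
from collections import deque


def edge_betweenness(adjacency):
    """Brandes-style edge betweenness for an unweighted, undirected graph.
    Every path is counted from both ends, so scores are doubled."""
    scores = {}
    for source in adjacency:
        # BFS that counts shortest paths (sigma) and records predecessors.
        sigma = {node: 0 for node in adjacency}
        preds = {node: [] for node in adjacency}
        dist = {source: 0}
        sigma[source] = 1
        order = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
                if dist[nxt] == dist[node] + 1:
                    sigma[nxt] += sigma[node]
                    preds[nxt].append(node)
        # Walk back from the farthest nodes, pushing credit onto edges.
        delta = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in preds[node]:
                share = sigma[pred] / sigma[node] * (1 + delta[node])
                edge = frozenset((pred, node))
                scores[edge] = scores.get(edge, 0.0) + share
                delta[pred] += share
    return scores


def components(adjacency):
    """Connected components as a list of node sets."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


def split_once(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    initial = len(components(adjacency))
    while len(components(adjacency)) == initial:
        scores = edge_betweenness(adjacency)
        if not scores:
            break  # no edges left to remove
        a, b = max(scores, key=scores.get)
        adjacency[a].discard(b)
        adjacency[b].discard(a)
    return components(adjacency)


# Hypothetical graph: two triangles joined by the single link 3-4.
groups = split_once([(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)])
```

No domain knowledge goes in: the split falls out of the structure alone, which is the point made above about the karate club.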
But is there enough information in the graph itself to predict the schism?
Actually Neo4j already has a bunch of these algorithms.
Call them easily from Cypher
Emergent intelligence from the graph!
Efficiency for graph operations is paramount.
You don’t need huge macho clusters to do this.
Large payment provider, transaction history
A 300M node, ~18B rel graph pageranked with 20 iterations in less than 2 hours using the graph algos.
On commodity hardware.
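For readers who have not seen it, PageRank with a fixed number of iterations is a short loop. The sketch below is a plain power iteration in Python, offered only to show what "20 iterations" means; it is not the Neo4j graph algorithms implementation, and the damping value of 0.85 and the four-node graph are assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=20):
    """Power-iteration PageRank over a dict of node -> list of outgoing targets."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for node, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical payment graph: a, b and d all pay c; c pays a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

Each iteration touches every relationship once, so the cost of a run grows with the number of relationships times the number of iterations.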
Contemporary AI
Graph structure itself is rich.
In this example we don’t need to know the content of the messages to know they’re spam at high confidence, just their position in the graph.
Mine a vector of graph features, feed it into the trained model.
Graphs have a key advantage: structural context. Where is the node in the graph? Who are its neighbours? Etc.
That richness feeds into the model and makes it better, more accurate, more dependable.
PageRank, Degree, Neighbourhood, Colour, etc are all features that improve your ML outcomes but are only available from graphs.
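As a small illustration of mining a vector of graph features, the sketch below computes three structural features per node: degree, number of triangles, and local clustering coefficient. The choice of features and the sample graph are assumptions for the example; a real pipeline would add whatever features suit the domain and then hand the vectors to a trained model.

```python
def structural_features(adjacency):
    """Per-node [degree, triangle count, local clustering coefficient]."""
    features = {}
    for node, nbrs in adjacency.items():
        degree = len(nbrs)
        # A triangle through `node` is a pair of its neighbours that are linked.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adjacency[a])
        possible = degree * (degree - 1) / 2
        clustering = links / possible if possible else 0.0
        features[node] = [degree, links, clustering]
    return features


# Hypothetical undirected graph: a triangle a-b-c with d hanging off c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
vectors = structural_features(adjacency)
```

None of these numbers can be read off a single row of a table; they only exist once the relationships are taken into account.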
ICLR = International Conference on Learning Representations
Graph of movies that a user liked.
Feed into neural net
Graph of users who rated one of those movies.
Feed into neural net.
Recurse through the data until you get to all the movies and all the users which are just embedding vectors (fancy hashes that place like near like in a vector space).
[Can change these vectors for features to avoid cold-starts, without changing overall architecture.]
Graph of back-propagated trained neural nets.
Incremental: Scalable for both training and prediction.
Extensible: bring in other graph layers!
Better than collaborative filtering because it can work on any graph, not just bipartite user-likes-movies graphs. E.g. User likes actor in movies with genre – much richer!
A bipartite graph, also called a bigraph, is a set of graph vertices decomposed into two disjoint sets such that no two graph vertices within the same set are adjacent. I.e. Users don’t connect to users, only to movies.
This is already happening - it’s YouTube’s recommender algorithm.
A growing realisation from leaders in the AI community: graph networks as the foundational building block for human-like AI.
Argue: combinatorial generalization must be a top priority for AI to achieve human-like abilities. Must be able to compose a finite set of elements in infinite ways (eg like language)
We draw analogies by aligning the relational structure between two domains and drawing inferences about one based on corresponding knowledge about the other (Gentner and Markman, 1997; Hummel and Holyoak, 2003). Hierarchies are critical.
Inductive bias: how the algorithm prioritises solutions.
Relational inductive biases to guide deep learning about entities, relations, and rules for composing them. I.e. the learning understands graphs
All this might seem hard at first – we’re used to tables, and our toolkits expect them.
Graphs change this for the better. Once you get graphs, all the other things seem hard
“a vast gap between human and machine intelligence remains, especially with respect to efficient, generalizable learning”
70% of graph ML today is still turning graphs to vectors
E.g. DeepWalk - take random walks through the graph and assign each node a vector based on the neighbourhood encountered
30% is truly graph AI - “differentiable neural computer” -> discern patterns that users can’t; write sophisticated algorithms (fraud, shortest path, etc) from incentive declarations.
E.g. no longer need a human expert to discover the “young father” pattern in our data, the machine learns it’s a valuable query in some contexts.
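The random-walk idea mentioned above is easy to picture in code. The sketch below covers only the first stage, generating walks; in DeepWalk those walks are then treated like sentences and fed to a skip-gram model to learn the node vectors, which is not shown here. The parameter names, defaults and sample graph are assumptions.

```python
import random


def random_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Generate fixed-length random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adjacency):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = sorted(adjacency[walk[-1]])
                if not neighbours:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical undirected graph: a triangle a-b-c with d hanging off c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
walks = random_walks(adjacency)
```

Nodes that often appear close together in these walks end up with similar vectors, which is how neighbourhood structure gets turned into features.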
So enjoy using graphs for AI, but please remember graphs for good!