This document provides an overview of GraphDB and Neo4j. It discusses why graphs are useful for modeling connected data and common use cases. It also summarizes Neo4j's transactional graph database capabilities, performance advantages, and deployment options. Key topics covered include causal clustering, query planning, and driver and tooling support for developers.
How Graph Databases efficiently store, manage and query connected data at s... (jexp)
Graph Databases try to make it easy for developers to leverage huge amounts of connected information for everything from routing to recommendations. Doing that poses a number of challenges on the implementation side. In this talk we want to look at the different storage, query and consistency approaches that are used behind the scenes. We’ll check out current and future solutions used in Neo4j and other graph databases for addressing global consistency, query and storage optimization, indexing and more, and see which papers and research database developers take inspiration from.
GraphQL - The new "Lingua Franca" for API-Development (jexp)
Three years ago, with the release of the GraphQL specification, Facebook took a fresh stab at the topic of "API design between remote services and applications." The key aspects of GraphQL provide a common, schema-based, domain-specific language and flexible, dynamic queries at interface boundaries.
In the talk, I'd like to compare GraphQL and REST and showcase benefits for developers and architects using a concrete example in application and API development, data source and system integration.
This developer-focused webinar will explain how to use the Cypher graph query language. Cypher, a query language designed specifically for graphs, allows for expressing complex graph patterns using simple ASCII art-like notation and offers a simple but expressive approach for working with graph data.
During this webinar you'll learn:
-Basic Cypher syntax
-How to construct graph patterns using Cypher
-Querying existing data
-Data import with Cypher
-Using aggregations such as statistical functions
-Extending the power of Cypher using procedures and functions
During this presentation, Will covers the updates made in the Neo4j 3.0 release. He introduces Bolt (Neo4j's new binary protocol), and shows how developers can start using the Neo4j official drivers, build a stored procedure and take advantage of advanced support for cloud, container and on-premise.
Neo4j-Databridge: Enterprise-scale ETL for Neo4j (Neo4j)
Neo4j-Databridge is a fully-featured ETL tool specifically built for Neo4j, and designed for usability, expressive power and high performance. It has been created to help solve the most common problems faced by large enterprises when importing data into Neo4j - data locality, multiple data sources and formats, performance when loading very large data sets, bespoke data conversions, inclusion of non-tabular data, filtering, merging and de-duplication...
In this webinar, we’ll take a quick tour of the main features of Neo4j-Databridge and see how it can help solve these problems and make importing your data into Neo4j quick and easy.
Relational databases were conceived to digitize paper forms and automate well-structured business processes, and still have their uses. But, oftentimes with RDBMS, performance degrades with the increasing number and levels of data relationships and data size.
A graph database like Neo4j naturally stores, manages, analyzes, and uses data within the context of connections, which means Neo4j provides faster query performance and far more flexibility in handling complex hierarchies than SQL-based systems.
This webinar explains why companies are shifting away from RDBMS towards graphs to unlock the business value in their data relationships.
Complex hierarchical relationships between entities can only be mapped with difficulty in a relational database and demanding queries are usually quite slow.
Graph databases are optimized for exactly these kinds of relationships and can provide high-performance results even with huge amounts of data. Moreover, not only the entities stored in the database have attributes; their relationships do as well. Queries can look at entities as well as their relationships.
Get to know the basics of graph databases, using Neo4j as an example, and see how it is used in C# projects.
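The idea that relationships carry attributes of their own, and that queries can filter on them, can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not how Neo4j stores data; the `PropertyGraph` class, the `REPORTS_TO` relationship type, the `since` property and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    props: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    end: str
    type: str
    props: dict = field(default_factory=dict)  # relationships have attributes too


class PropertyGraph:
    """Toy property graph: nodes plus typed, attributed, directed relationships."""

    def __init__(self):
        self.nodes = {}
        self.outgoing = {}  # node name -> list of outgoing relationships

    def add_node(self, name, **props):
        self.nodes[name] = Node(name, props)
        self.outgoing.setdefault(name, [])

    def relate(self, start, rel_type, end, **props):
        self.outgoing[start].append(Relationship(start, end, rel_type, props))

    def follow(self, start, rel_type, rel_filter=lambda rel: True):
        """Walk a hierarchy upward, using only relationships accepted by rel_filter."""
        path, current, seen = [], start, {start}
        while True:
            matches = [r for r in self.outgoing[current]
                       if r.type == rel_type and rel_filter(r)]
            if not matches or matches[0].end in seen:
                return path
            current = matches[0].end
            seen.add(current)
            path.append(current)


g = PropertyGraph()
for person in ("Ann", "Bob", "Cy"):
    g.add_node(person)
g.relate("Ann", "REPORTS_TO", "Bob", since=2019)
g.relate("Bob", "REPORTS_TO", "Cy", since=2021)

# Query on entities only: the full reporting chain above Ann.
full_chain = g.follow("Ann", "REPORTS_TO")
# Query that also inspects relationship attributes: reporting lines older than 2020.
early_chain = g.follow("Ann", "REPORTS_TO", lambda rel: rel.props["since"] < 2020)
print(full_chain, early_chain)
```

The second query stops at Bob because the filter rejects the relationship from Bob to Cy, which is the kind of relationship-level condition the abstract refers to.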
An Introduction to Graph: Database, Analytics, and Cloud Services (Jean Ihm)
Graph analysis employs powerful algorithms to explore and discover relationships in social network, IoT, big data, and complex transaction data. Learn how graph technologies are used in applications such as fraud detection for banking, customer 360, public safety, and manufacturing. This session will provide an overview and demos of graph technologies for Oracle Cloud Services, Oracle Database, NoSQL, Spark and Hadoop, including PGX analytics and PGQL property graph query language.
Presented at Analytics and Data Summit, March 20, 2018
Family tree of data – provenance and neo4j (M. David Allen)
Discusses data provenance and how it can be implemented in neo4j, as well as many lessons learned about the relative strengths and weaknesses of relational and graph databases.
A Connections-first Approach to Supply Chain Optimization (Neo4j)
Supply chain optimization is an unusual balancing act that requires finesse, skill and timely data. The key questions every supply chain must answer are:
What to buy? -- What factors determine your optimal product mix and set of suppliers?
How much to buy? -- What are the most and least popular items in any given time interval?
When to buy? -- Long lags in delivery timing may limit your flexibility and influence your inventory management practices.
We will illustrate an API-based solution that utilizes a Graph database platform to add demonstrable value to Supply Planning.
Relational databases were conceived to digitize paper forms and automate well-structured business processes, and still have their uses. But RDBMS cannot model or store data and its relationships without complexity, which means performance degrades with the increasing number and levels of data relationships and data size. Additionally, new types of data and data relationships require schema redesign that increases time to market.
A graph database like Neo4j naturally stores, manages, analyzes, and uses data within the context of connections, which means Neo4j provides faster query performance and far more flexibility in handling complex hierarchies than SQL-based systems. Join this webinar to learn why companies are shifting away from RDBMS towards graphs to unlock the business value in their data relationships.
Ryan Boyd, Developer Relations at Neo4j
Ryan is an SF-based software engineer focused on helping developers understand the power of graph databases. Previously he was a product manager for architectural software, built applications and web hosting environments for higher education, and worked in developer relations for twenty products during his 8 years at Google. He enjoys cycling, sailing, skydiving, and many other adventures when not in front of his computer.
An introduction to Neo4j and Graph Databases. Learn about the primary use cases for Graph Databases and explore the properties of Neo4j that make those use cases possible.
Beyond Big Data: Leverage Large-Scale Connections (Neo4j)
Today’s CIOs and CTOs don’t just need to manage larger volumes of data – they need to generate insight from their existing data. In this case, the relationships between data points matter more than the individual points themselves. In order to leverage data relationships, organizations need a database technology that stores relationship information as a first-class entity. That technology is a graph database.
Attend this webinar to hear about:
1. Why graph technologies are essential for the future of increasingly connected data
2. How enterprises such as Walmart, eBay, and UBS are using Neo4j’s native-graph platform for a diverse set of use cases, including security & fraud detection, real-time recommendation engines, master data and many more
3. And how Neo4j on IBM POWER8 can scale your massive graph data with real-time graph processing that’s entirely in-memory.
Data is both our most valuable asset and our biggest ongoing challenge. As data grows in volume, variety and complexity, across applications, clouds and siloed systems, traditional ways of working with data no longer work.
Unlike traditional databases, which arrange data in rows, columns and tables, Neo4j has a flexible structure defined by stored relationships between data records.
We'll discuss the primary use cases for graph databases
Explore the properties of Neo4j that make those use cases possible
Look into the visualisation of graphs
Introduce how to write queries.
Webinar, 23 July 2020
AI, ML and Graph Algorithms: Real Life Use Cases with Neo4j (Ivan Zoratti)
I gave this presentation at DataOps 19 in Barcelona.
You will find information about Neo4j and how to use it with Graph Algorithms for Machine Learning and Artificial Intelligence.
A webinar on how Neo4j customers like NASA, Airbnb, eBay, government agencies, investigative journalists and others are building Knowledge Graphs to inform today and tomorrow’s solutions.
Neo4j GraphTalks Oslo - Graph Your Business - Rik Van Bruggen, Neo4j (Neo4j)
The Neo4j graph database is the fastest growing database engine in the market and has hundreds of customer references across Europe and globally, solving significant technology problems for large Enterprises in Finance, Telco, Retail, Utilities, Logistics and Internet sectors. Typical use cases are Recommendations, Fraud Detection, MDM, Network and Software Analysis and Optimization, Identity and Access Management.
Your Roadmap for An Enterprise Graph Strategy (Neo4j)
Speaker: Michael Moore, Ph.D., Executive Director, Knowledge Graphs + AI, EY National Advisory
Abstract: Knowledge graphs have enormous potential for delivering superior customer experiences, advanced analytics and efficient data management.
Learn valuable tips from a leading practitioner on how to position, organize and implement your first enterprise graph project.
Graph Database Use Cases - StampedeCon 2015 (StampedeCon)
Presented by Max De Marzi at StampedeCon 2015: Graphs are eating the world – but in what form? Starting off with a primer on Graph Databases, this talk will focus on practical examples of graph applications.
We’ll look at multiple use cases like job boards, dating sites, recommendation engines of all kinds, network management, scheduling engines, etc. We'll also see some examples of graph search in action.
Looming Marvelous - Virtual Threads in Java Javaland.pdf (jexp)
Nowadays we have 2 options for concurrency in Java:
* simple, synchronous, blocking code with limited scalability that tracks well linearly at runtime, or
* complex, asynchronous libraries with high scalability that are harder to handle.
Project Loom aims to bring together the best aspects of these two approaches and make them available to developers.
In the talk, I'll briefly cover the history and challenges of concurrency in Java before we dive into Loom's approaches and look at some of the implementation behind the scenes. Managing that many threads sensibly requires some structure; for this there are proposals for "Structured Concurrency", which we will also look at. Some examples and comparisons that put Loom to the test will round out the talk.
Project Loom is included in Java 19 and 20 as a preview feature, so we can already test how well it works with our applications and libraries.
Spoiler: Pretty good.
Easing the daily grind with the awesome JDK command line tools (jexp)
Included in the JDK installation are a lot of handy tools for Java developers, from java, jshell and jcmd to jfr and jdeprscan. These allow you to analyze a running JVM, generate JREs, run Java source code and much more. In this talk I would like to present a number of these tools with practical examples and thus expand the toolbox of the participants. With the command line tools, many tasks can be automated and executed more efficiently, leaving more time for the exciting things in developer life.
Today, we have 2 options for concurrency in Java:
* simple, synchronous, blocking code with limited scalability that tracks well linearly at runtime, or
* complex, asynchronous libraries with high scalability that are harder to handle.
Project Loom aims to bring together the best aspects of these two approaches and make them available to developers.
In the talk, I'll briefly discuss the history and challenges of concurrency in Java before we dive into Loom's approaches and look a bit behind the scenes.
Project Loom has been included as a preview feature since Java 19, so we can already test how well it works with our applications and libraries. Spoiler: Pretty good.
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx (jexp)
I was there when Cypher was invented in 2012 and have been using it ever since. The language is extremely powerful and easy to learn. But to truly master it, you need to understand how it works internally and how the database executes your queries. In this session, you'll learn to look behind the scenes at execution plans with PROFILE and EXPLAIN and which specific clauses, expressions, structures, and operations help you minimize Cypher and database operations. After this talk, you should be able to speed up your Cypher statements quite a bit.
The newly released Neo4j Connector for Apache Spark can be used to read and write data between the two systems.
In this demo I show how to use the investigative data from the FinCEN Files to get a full pipeline up and running.
Notebook is in https://github.com/jexp/fincen
How Graphs Help Investigative Journalists to Connect the Dots (jexp)
The journalists of the ICIJ used graph technology to understand the relationships between the leaked pieces of information in the Panama and Paradise Papers.
NBC News applied graph algorithms to the messages and follower networks of Russian Twitter trolls to gain further insights.
The Trumpworld organizational data correlated with US bills and government contracts offers starting points for further investigations.
New tools like graph databases allow data journalists to understand the intricate networks of the criminal, economic and political world better as those three examples show. Each journalist adding new connections helps others to validate their stories. They say "It's like magic".
Join Michael for a look behind the scenes of graph based data ingestion, analysis and investigation.
We will use the open source graph database Neo4j, data visualization and graph algorithms to read between the lines.
Who doesn't know him, the office hero, who sat in the office late into the evening and repaired production? The fact that perhaps another colleague sat on the sofa at home and had an equal share in this success is unfortunately not so appreciated in most company cultures. But why is that? Because we are not used to working at home? Because we think that you are not so productive at home? Because you have family, garden or other activities at home? Michael has been working for distributed companies for a long time, but has also worked in offices for a long time. He will take you on his journey through different working environments and tell you what worked well for him.
The JVM is already a runtime for many languages. With the optimizing Graal compiler added to Java 11 and the language implementations in Truffle for Ruby, Python, JavaScript, and R it becomes possible to run them natively on the JVM, even exchanging data between them.
Michael Hunger explains the concepts behind Truffle and Graal and uses a practical example to show how you can use Python and JavaScript for “stored procedures” in a JVM-based database.
He demonstrates how to optimize the startup time of your application and container images by precompiling it to machine-code and examines its limits and the difference it makes. But nothing is perfect—Michael discusses the limitations and compares performances for the full picture.
Presentation at OSCON, PDX 2019.
https://conferences.oreilly.com/oscon/oscon-or/public/schedule/detail/76092
Neo4j Graph Streaming Services with Apache Kafka (jexp)
In this presentation we give a high-level overview of the Neo4j-Kafka integration and the Confluent partnership.
Providing change-data-capture and ingestion capabilities as Neo4j Extension and the Kafka Connect Neo4j Sink on Confluent Hub allows you to integrate real-time streaming with graph querying and analytics.
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library (jexp)
APOC has become the de-facto standard utility library for Neo4j. In this talk, I will demonstrate some of the lesser known but very useful components of APOC that will save you a lot of work. You will also learn how to combine individual functions into powerful constructs to achieve impressive feats.
This will be a fast-paced demo/live-coding talk.
Video: https://neo4j.com/graphconnect-2018/session/neo4j-utility-library-apoc-pearls
Unicorn images by TeeTurtle.com (Unstable Unicorns is a fun game & cool t-shirts)
Code we've written once has to be kept readable, maintainable, understandable and extensible for many years. Good code is not self-serving but the foundation for working together.
Refactoring can help you keep the quality of the relevant parts of your systems high.
The technique is really easy (almost too easy) - improve the naming, structure, and responsibility in small steps that don't change behavior and run your tests after each step.
18 years ago I got hooked on Refactoring when Martin Fowler's first book came out. I've been using it since then on a daily basis on many different projects. Since then a lot has changed, especially with the help of modern IDEs with their automated refactorings and intentions.
Now he asked me to help review the 2nd edition. Our discussions reminded me that each generation of developers should be taught this crucial skill. That's why I want to give an overview of core refactorings and code-smells but also demonstrate the tips and tricks of today's tools that make this task so much easier.
Plus a sneak preview of the upcoming book.
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu... (jexp)
Highlighting the progress in Neo4j 3.3 and 3.4 especially
Neo4j Desktop, Graph Algorithms, NLP, Date-Time, Geospatial, and performance.
Also featuring the new visualization tool Neo4j Bloom.
We recently released the Neo4j graph algorithms library.
You can use these graph algorithms on your connected data to gain new insights more easily within Neo4j. You can use these graph analytics to improve results from your graph data, for example by focusing on particular communities or favoring popular entities.
We developed this library as part of our effort to make it easier to use Neo4j for a wider variety of applications. Many users expressed interest in running graph algorithms directly on Neo4j without having to employ a secondary system.
We also tuned these algorithms to be as efficient as possible with regard to resource utilization and streamlined them for later management and debugging.
In this session we'll look at some of these graph algorithms and the types of problems that you can use them for in your applications.
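To make "favoring popular entities" concrete, here is a minimal PageRank sketch in plain Python. It illustrates the general technique only and is not the implementation in the Neo4j graph algorithms library; the edge list, the damping factor of 0.85 and the iteration count are assumptions chosen for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical "follows" relationships: three people point at carol.
follows = [("alice", "carol"), ("bob", "carol"), ("carol", "dave"), ("dave", "carol")]
ranks = pagerank(follows)
most_popular = max(ranks, key=ranks.get)
print(most_popular, round(ranks[most_popular], 3))
```

A recommendation query could then sort candidates by such a score so that well-connected entities surface first.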
Despite the “Graph” in the name, GraphQL is mostly used to query relational databases, object models or APIs. But it is really easy to support GraphQL endpoints from graph databases too. In this talk, I’ll demonstrate how we implemented a GraphQL extension for the Neo4j graph database. It uses the GraphQL schema definition to map arbitrary GraphQL queries into single graph queries and runs them against the data in the graph database. Using directives in the schema, we added some cool features that are transparent to the end user like computed fields and auto-generated mutations and query types. That allows you to create GraphQL APIs of some complexity without writing a single line of code.
I will show how to use the Neo4j-GraphQL extension, by creating an endpoint for the Game of Thrones dataset, and how we then can use our well-known tools (GraphiQL, apollo-client, graphql-cli, voyager) to interact with it.
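The idea of turning a nested GraphQL selection into one graph query can be sketched as follows. This is an illustration under stated assumptions and not the extension's actual code: the nested dict stands in for a parsed GraphQL query, and the `Movie`/`Person` types, the `ACTED_IN` relationship and the `RELATION_FIELDS` mapping are hypothetical. The generated text relies on Cypher map projections and pattern comprehensions.

```python
# Hypothetical schema mapping: (type, field) -> (relationship pattern, target label).
RELATION_FIELDS = {
    ("Movie", "actors"): ("<-[:ACTED_IN]-", "Person"),
    ("Person", "movies"): ("-[:ACTED_IN]->", "Movie"),
}


def projection(type_name, variable, selection):
    """Build a Cypher map projection for one level of the selection set."""
    parts = []
    for field_name, sub_selection in selection.items():
        if sub_selection is None:
            # Scalar field: read the property straight off the node.
            parts.append(f".{field_name}")
        else:
            # Relationship field: nest a pattern comprehension, recursively.
            pattern, target = RELATION_FIELDS[(type_name, field_name)]
            child = f"{variable}_{field_name}"
            nested = projection(target, child, sub_selection)
            parts.append(
                f"{field_name}: [({variable}){pattern}({child}:{target}) | {nested}]"
            )
    return f"{variable} {{ {', '.join(parts)} }}"


def to_cypher(type_name, selection):
    """Translate a whole nested selection into a single Cypher statement."""
    variable = type_name.lower()
    body = projection(type_name, variable, selection)
    return f"MATCH ({variable}:{type_name}) RETURN {body} AS {variable}"


# Stand-in for the GraphQL query: { Movie { title actors { name } } }
query = to_cypher("Movie", {"title": None, "actors": {"name": None}})
print(query)
```

However deep the selection nests, the output stays a single statement, which is what lets one GraphQL request map to one database round trip.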
Despite the “Graph” in the name, GraphQL is mostly used to query relational databases or object models. But it is really well suited to querying graph databases too. In this talk, I’ll demonstrate how I implemented a GraphQL endpoint for the Neo4j graph database and how you would use it in your app.
The world around us is full of connected information. Neo4j was originally developed to solve two complex "network" problems in a document management system, as it was too hard to manage rich connection information efficiently in traditional and new "NOSQL" databases. During this meetup, we will talk about the technology, and about the journey that a couple of technologists from Malmö took. You will learn
* how Neo Technology grew from just the three founders into a global database company with use-cases in every domain imaginable.
* how focusing on customer and community feedback allows us to provide a solution for managing connected data to everyone, not just the large internet companies.
Of course we will also introduce the graph model, its whiteboard friendliness and how you get started with Neo4j and its easy and powerful query language Cypher. We'll also compare the graph and relational data model to see how they differ in shape and capabilities. Finally we discuss the foundations that enable Graph databases to provide higher join performance, faster development processes and more inclusive software for all stakeholders. With use-cases from Gaming, Dating and Finance we'll see how to apply the graph capabilities to these domains to realize new functionality or opportunities that were not possible before.
Finally, if there's a question you've always wanted to ask/discuss, we'll have plenty of time for that at the end of Michael's presentation.
Each of the files or classes of a project's source code represents a tree (AST). Looking at dependencies on other classes besides inheritance creates a graph, though. Field types and method parameters are also implicit dependencies. Storing this information in a graph database like Neo4j allows for interesting queries and insights. Class-Graph provides that and is available as an open-source GitHub project.
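The step from per-class syntax trees to a dependency graph can be sketched with Python's standard `ast` module. This is a simplified analogue and not Class-Graph itself: the sample source, the `EXTENDS` and `DEPENDS_ON` edge names and the restriction to classes defined in the same file are all assumptions made for the example.

```python
import ast

# Hypothetical source to analyze.
SOURCE = '''
class Engine:
    pass

class Vehicle:
    engine: Engine

class Car(Vehicle):
    def tow(self, other: Vehicle) -> None:
        pass
'''


def class_dependencies(source):
    """Return (class, edge_type, class) triples found in one module's source."""
    tree = ast.parse(source)
    classes = [node for node in tree.body if isinstance(node, ast.ClassDef)]
    known = {cls.name for cls in classes}
    edges = set()
    for cls in classes:
        # Inheritance is the explicit dependency.
        for base in cls.bases:
            if isinstance(base, ast.Name) and base.id in known:
                edges.add((cls.name, "EXTENDS", base.id))
        # Field types and method parameters are the implicit ones.
        for node in ast.walk(cls):
            annotations = []
            if isinstance(node, ast.AnnAssign):
                annotations.append(node.annotation)
            elif isinstance(node, ast.arg) and node.annotation is not None:
                annotations.append(node.annotation)
            for annotation in annotations:
                for name in ast.walk(annotation):
                    if (isinstance(name, ast.Name) and name.id in known
                            and name.id != cls.name):
                        edges.add((cls.name, "DEPENDS_ON", name.id))
    return edges


dependency_edges = class_dependencies(SOURCE)
for edge in sorted(dependency_edges):
    print(edge)
```

Each triple corresponds to one relationship that could be stored in a graph database and then queried, for example to find every class that transitively depends on `Engine`.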
In this talk, Michael Hunger is going to shed some light over the new High Availability architecture for the popular Neo4j Graph Database. We are going to look at the different variants of the Paxos protocol, master failover strategies and cluster management state handling. This piece of infrastructure poses non-trivial challenges to distributed consensus-finding, an interesting session for anyone into scalable systems.
Graphs are everywhere. From websites adding social capabilities to Telcos providing personalized customer services, to innovative bioinformatics research, organizations are adopting graph databases as the best way to model and query connected data. If you can whiteboard, you can model your domain in a graph database.
In this session Emil Eifrem provides a close look at the graph model and offers best use cases for effective, cost-efficient data storage and accessibility.
Takeaways:
* Understand the model of a graph database and how it compares to document and relational databases
* Understand why graph databases are best suited for the storage, mapping and querying of connected data
Emil's presentation will be followed by a Hands-on Guide to Spring Data Neo4j. Spring Data Neo4j provides straightforward object persistence into the Neo4j graph database. Conceived by Rod Johnson and Neo Technology CEO Emil Eifrem, it is the founding project of the Spring Data effort. The library leverages a tight integration with the Spring Framework and the Spring Data infrastructure. Besides the easy to use object graph mapping it offers the powerful graph manipulation and query capabilities of Neo4j with a convenient API.
The talk introduces the different aspects of Spring Data Neo4j and shows applications in several example domains.
During the session we walk through the creation of an engaging sample application that starts with the setup and annotating the domain objects. We see the usage of Neo4jTemplate and the powerful repository abstraction. After deploying the application to a cloud PaaS we execute some interesting query use-cases on the collected data.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
7. Value from Data Relationships
Common Use Cases
Internal Applications
• Master Data Management
• Network and IT Operations
• Fraud Detection
Customer-Facing Applications
• Real-Time Recommendations
• Graph-Based Search
• Identity and Access Management
8. The Rise of Connections in Data
• Networks of People: e.g., Employees, Customers, Suppliers, Partners, Influencers
• Business Processes: e.g., Risk management, Supply chain, Payments
• Knowledge Networks: e.g., Enterprise content, Domain-specific content, eCommerce content
Data connections are increasing as rapidly as data volumes
9. Harnessing Connections Drives Business Value
Enhanced Decision Making
Hyper Personalization
Massive Data Integration
Data Driven Discovery & Innovation
Product Recommendations
Personalized Health Care
Media and Advertising
Fraud Prevention
Network Analysis
Law Enforcement
Drug Discovery
Intelligence and Crime Detection
Product & Process Innovation
360 view of customer
Compliance
Optimize Operations
Connected Data at the Center
AI & Machine Learning
Price optimization
Product Recommendations
Resource allocation
Digital Transformation Megatrends
13. Newcomers in the last 3 years
• DSE Graph
• Agens Graph
• IBM Graph
• JanusGraph
• Tibco GraphDB
• Microsoft CosmosDB
• TigerGraph
• MemGraph
• AWS Neptune
• SAP HANA Graph
21. Cancer Research - Candiolo Cancer Institute
“Our application relies on complex
hierarchical data, which required a more
flexible model than the one provided by
the traditional relational database
model,” said Andrea Bertotti, MD
neo4j.com/case-studies/candiolo-cancer-institute-ircc/
22. Graph Databases in Healthcare and Life Sciences
14 Presenters from all around Europe on:
• Genome
• Proteome
• Human Pathway
• Reactome
• SNP
• Drug Discovery
• Metabolic Symbols
• ...
neo4j.com/blog/neo4j-life-sciences-healthcare-workshop-berlin/
27. Handling Large Graph Workloads for Enterprises
Real-time promotion recommendations
• Record “Cyber Monday” sales
• About 35M daily transactions
• Each transaction is 3-22 hops
• Queries executed in 4ms or less
• Replaced IBM Websphere commerce
Marriott’s Real-time Pricing Engine
• 300M pricing operations per day
• 10x transaction throughput on half the hardware compared to Oracle
• Replaced Oracle database
Handling Package Routing in Real-Time
• Large postal service with over 500k employees
• Neo4j routes 7M+ packages daily at peak, with peaks of 5,000+ routing operations per second
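As a rough sanity check on the daily figures quoted above, they can be converted into average per-second rates. This is only a sketch: it assumes load is spread evenly over 24 hours, which real traffic is not, and the helper name `per_second` is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(daily_total):
    """Average rate per second, assuming an even spread over the day."""
    return daily_total / SECONDS_PER_DAY

# Daily figures quoted on the slide.
promotions_tx = per_second(35_000_000)   # daily promotion transactions
pricing_ops = per_second(300_000_000)    # daily pricing operations
packages = per_second(7_000_000)         # packages routed on a peak day

print(f"promotions: ~{promotions_tx:.0f} tx/s")         # ~405
print(f"pricing:    ~{pricing_ops:.0f} ops/s")          # ~3472
print(f"routing:    ~{packages:.0f} packages/s (avg)")  # ~81
```

The quoted peak of 5,000+ routing operations per second is far above the roughly 81 packages per second average, which suggests the load is bursty rather than evenly spread.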
35. The Whiteboard Model Is the Physical Model
Eliminates Graph-to-Relational Mapping
• In your data model: Bridge the gap between business and IT models
• In your application: Greatly reduce need for application code
36. Property Graph Model Components
Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled
Relationships
• Relate nodes by type and direction
• Can have name-value properties
Example from the diagram:
• PERSON node: name: “Dan”, born: May 29, 1970, twitter: “@dan”
• PERSON node: name: “Ann”, born: Dec 5, 1975
• CAR node: brand: “Volvo”, model: “V70”
• Relationships: LOVES, LOVES, LIVES WITH
• Relationship property: since: Jan 10, 2011
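The components listed above (labeled nodes with properties, typed and directed relationships with properties) can be sketched as plain data structures. This is a minimal in-memory illustration, not how Neo4j stores data. The class names, the direction of each relationship, and the choice to attach `since` to LIVES_WITH are assumptions made for the example, since the transcript does not preserve them.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    rel_type: str
    start: Node   # relationships are directed: start -> end
    end: Node
    properties: dict = field(default_factory=dict)

dan = Node({"Person"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship("LOVES", dan, ann),   # direction assumed
    Relationship("LOVES", ann, dan),   # direction assumed
    # Assumption: the slide does not say which relationship carries "since".
    Relationship("LIVES_WITH", dan, ann, {"since": "Jan 10, 2011"}),
]

def outgoing(node, rel_type):
    """Return the end nodes of all relationships of rel_type leaving node."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]

print([n.properties["name"] for n in outgoing(dan, "LOVES")])  # ['Ann']
```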
37. Cypher: Powerful and Expressive Query Language
MATCH (:Person { name:"Dan"} ) -[:LOVES]-> (:Person { name:"Ann"} )
LOVES
Dan Ann
LABEL PROPERTY
NODE NODE
LABEL PROPERTY
38. Relational Versus Graph Models
Relational Model Graph Model
KNOWS
ANDREAS
TOBIAS
MICA
DELIA
Person Person-Friend Friend
ANDREAS
DELIA
TOBIAS
MICA
51. You all know SQL
SELECT distinct c.CompanyName
FROM customers AS c
JOIN orders AS o
ON (c.CustomerID = o.CustomerID)
JOIN order_details AS od
ON (o.OrderID = od.OrderID)
JOIN products AS p
ON (od.ProductID = p.ProductID)
WHERE p.ProductName = 'Chocolat'
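To see the three-way join above run end to end, here is a small sketch using Python's built-in `sqlite3` module. The table layout follows the column names used in the query. The sample rows (company names, IDs, the second product) are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, shaped to match the columns the query references.
conn.executescript("""
CREATE TABLE customers (CustomerID INTEGER PRIMARY KEY, CompanyName TEXT);
CREATE TABLE orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
CREATE TABLE order_details (OrderID INTEGER, ProductID INTEGER);
CREATE TABLE products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
INSERT INTO customers VALUES (1, 'Delicatessen'), (2, 'Other Co');
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO order_details VALUES (10, 100), (11, 100), (12, 101);
INSERT INTO products VALUES (100, 'Chocolat'), (101, 'Tea');
""")

# The query from the slide: three JOINs to get from customer to product.
query = """
SELECT DISTINCT c.CompanyName
FROM customers AS c
JOIN orders AS o ON (c.CustomerID = o.CustomerID)
JOIN order_details AS od ON (o.OrderID = od.OrderID)
JOIN products AS p ON (od.ProductID = p.ProductID)
WHERE p.ProductName = 'Chocolat'
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Delicatessen',)]
```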
55. Basic Pattern: Customer's Orders?
MATCH (:Customer {customerName:"Delicatessen"} ) -[:ORDERED]-> (order:Order) RETURN order
VAR LABEL
NODE NODE
LABEL PROPERTY
ORDERED
Customer Order
REL
56. Basic Query: Customer's Orders?
MATCH (c:Customer)-[:ORDERED]->(order)
WHERE c.customerName = 'Delicatessen'
RETURN *
57. Basic Query: Customer's Frequent Purchases?
MATCH (c:Customer)-[:ORDERED]->
()-[:INCLUDES]->(p:Product)
WHERE c.customerName = 'Delicatessen'
RETURN p.productName, count(*) AS freq
ORDER BY freq DESC LIMIT 10;
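What the query above computes can be mirrored in a few lines of plain Python: follow ORDERED from the customer to orders, follow INCLUDES from orders to products, then count and rank. The dictionaries below are hypothetical sample data standing in for the graph, and the function name is illustrative.

```python
from collections import Counter

# Hypothetical sample data: one adjacency map per relationship type.
ordered = {"Delicatessen": ["order1", "order2"], "Other Co": ["order3"]}
includes = {
    "order1": ["Chocolat", "Tea"],
    "order2": ["Chocolat"],
    "order3": ["Tea"],
}

def frequent_purchases(customer_name, limit=10):
    """Count products reached via customer -> order -> product, most frequent first."""
    freq = Counter(
        product
        for order in ordered.get(customer_name, [])
        for product in includes.get(order, [])
    )
    return freq.most_common(limit)

print(frequent_purchases("Delicatessen"))  # [('Chocolat', 2), ('Tea', 1)]
```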
61. openCypher...
...is a community effort to evolve Cypher, and to
make it the most useful language for querying
property graphs
openCypher implementations
SAP Hana Graph, Redis, Agens Graph, Cypher.PL, Neo4j
62. github.com/opencypher Language Artifacts
● Cypher 9 specification
● ANTLR and EBNF Grammars
● Formal Semantics (SIGMOD)
● TCK (Cucumber test suite)
● Style Guide
Implementations & Code
● openCypher for Apache Spark
● openCypher for Gremlin
● open source frontend (parser)
● ...
63. Cypher 10
● Next version of Cypher
● Actively working on natural language specification
● New features
○ Subqueries
○ Multiple graphs
○ Path patterns
○ Configurable pattern matching semantics
65. Extending Neo4j - User Defined Procedures & Functions
[Diagram labels: Applications, Bolt, Neo4j Execution Engine, User Defined Procedure, User Defined Functions]
User Defined Procedures & Functions let
you write custom code that is:
• Written in any JVM language
• Deployed to the Database
• Accessed by applications via Cypher
69. The Impact of Connected Data
“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions”
70. Existing Options (so far)
• Data Processing: Spark with GraphX, Flink with Gelly, Gremlin Graph Computer
• Dedicated Graph Processing: Urika, GraphLab, Giraph, Mosaic, GPS, Signal-Collect, Gradoop
• Data Scientist Toolkit: igraph, NetworkX, Boost in Python, R, C
71.
72. Goal: Iterate Quickly
• Combine data from sources into one graph
• Project to relevant subgraphs
• Enrich data with algorithms
• Traverse, collect, filter, aggregate with queries
• Visualize, Explore, Decide, Export
• From all APIs and Tools
73.
74. Usage
1. Call as Cypher procedure
2. Pass in specification (Label, Prop, Query) and configuration
3. The ~.stream variant returns (a lot of) results
CALL algo.<name>.stream('Label','TYPE',{conf})
YIELD nodeId, score
4. The non-stream variant writes results to the graph and returns statistics
CALL algo.<name>('Label','TYPE',{conf})
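The difference between the two calling styles above can be illustrated with a toy PageRank in plain Python. This is not the Neo4j algorithms library. The function names, the `pagerank` property name, and the statistics fields are all hypothetical, and the simple power iteration assumes a graph without dead ends (every node has at least one outgoing edge).

```python
def pagerank(edges, damping=0.85, iterations=20):
    """Plain power-iteration PageRank; assumes every node has an outgoing edge."""
    nodes = sorted({n for edge in edges for n in edge})
    out_degree = {n: 0 for n in nodes}
    for source, _ in edges:
        out_degree[source] += 1
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1 - damping) / len(nodes) for n in nodes}
        for source, target in edges:
            new_rank[target] += damping * rank[source] / out_degree[source]
        rank = new_rank
    return rank

def pagerank_stream(edges, **conf):
    """Stream style: yield one (node_id, score) row per node, write nothing."""
    for node_id, score in pagerank(edges, **conf).items():
        yield node_id, score

def pagerank_write(edges, node_properties, **conf):
    """Write style: store scores as node properties and return only statistics."""
    scores = pagerank(edges, **conf)
    for node_id, score in scores.items():
        node_properties.setdefault(node_id, {})["pagerank"] = score
    return {"nodes": len(scores), "writeProperty": "pagerank"}  # hypothetical fields

edges = [(1, 2), (2, 3), (3, 1), (1, 3)]   # hypothetical graph, no dead ends
rows = list(pagerank_stream(edges))
props = {}
stats = pagerank_write(edges, props)
print(rows)
print(stats)
```

The stream style suits ad hoc inspection of scores in a query. The write style suits enriching the graph so that later queries can filter or sort on the stored property.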
75. Cypher Projection
Pass in Cypher statements for node and relationship lists.
CALL algo.<name>(
'MATCH ... RETURN id(n)',
'MATCH (n)-->(m)
RETURN id(n) as source,
id(m) as target', {graph:'cypher'})
78. Neo4j Fits into Your Environment
[Diagram labels: Application; End User; Data Scientist; Graph Database Cluster (Neo4j, Neo4j, Neo4j); Graph Compute Engine; Data Storage and Business Rules Execution; Data Mining and Aggregation; Ad Hoc Analysis; Bulk Analytic Infrastructure; EDW …; Databases; Relational; NoSQL; Hadoop]
79. Official Language Drivers
• Foundational drivers for popular programming languages
• Bolt: streaming binary wire protocol
• Authoritative mapping to native type system, uniform across drivers
• Pluggable into richer frameworks
Drivers: JavaScript, Java, .NET, Python, PHP, ...
Bolt
80. Bolt + Official Language Drivers
http://neo4j.com/developer/ http://neo4j.com/developer/language-guides/
81. Using Bolt: Official Language Drivers look all the same
With JavaScript
var neo4j = require("neo4j-driver");
var driver = neo4j.driver("bolt://localhost");
var session = driver.session();
var result = session.run("MATCH (u:User) RETURN u.name");
82. neo4j.com/developer/spring-data-neo4j
Spring Data Neo4j Neo4j OGM
@NodeEntity
public class Talk {
@Id @GeneratedValue
Long id;
String title;
Slot slot;
Track track;
@Relationship(type="PRESENTS",
direction=INCOMING)
Set<Person> speaker = new HashSet<>();
}
83. Spring Data Neo4j Neo4j OGM
interface TalkRepository extends Neo4jRepository<Talk, Long> {
@Query("MATCH (t:Talk)<-[rating:RATED]-(user) " +
"WHERE t.id = {talkId} RETURN rating")
List<Rating> getRatings(@Param("talkId") Long talkId);
List<Talk> findByTitleContaining(String title);
}
88. Neo4j: Graph Platform
Real-time Transactional and Analytic Processing
• Operational workloads
• Analytics workloads
Discovery and Visualization
• Interactive graph exploration
• Graph representation of data
Agility
• Native property graph model
• Dynamic schema
Developer Productivity
• Cypher - Declarative query language
• Procedural language extensions
• Worldwide developer community
Hardware efficiency
• 10x less CPU with index-free adjacency
• 10x less hardware than other platforms
Performance
• Index-free adjacency
• Millions of hops per second
89.
90. Native Graph Architecture
Index-free adjacency
• Index-free adjacency ensures lightning-fast retrieval of data and relationships
• Unlike other database models, Neo4j connects data as it is stored
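The idea behind index-free adjacency can be sketched in a few lines: each node holds direct references to its neighbors, so a traversal follows pointers instead of searching a shared table on every hop. This is a simplified illustration and not Neo4j's actual storage format. The names come from the relational-versus-graph slide, while the friendship edges themselves are invented for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references to adjacent nodes

def friends_of_friends(node):
    """Follow stored references two hops out; no global lookup is involved."""
    result = []
    for friend in node.neighbors:
        for fof in friend.neighbors:
            if fof is not node and fof not in node.neighbors and fof not in result:
                result.append(fof)
    return [n.name for n in result]

andreas, tobias, mica, delia = (Node(n) for n in ["Andreas", "Tobias", "Mica", "Delia"])
andreas.neighbors.append(tobias)   # hypothetical edges
tobias.neighbors.append(mica)
mica.neighbors.append(delia)

# For contrast: the same data as a join table, searched again on every hop.
person_friend = [("Andreas", "Tobias"), ("Tobias", "Mica"), ("Mica", "Delia")]

def friends_via_table(name):
    """Each call scans the whole table (an index would replace the scan with a search)."""
    return [friend for person, friend in person_friend if person == name]

fof_table = [f2 for f1 in friends_via_table("Andreas") for f2 in friends_via_table(f1)]

print(friends_of_friends(andreas))  # ['Mica']
print(fof_table)                    # ['Mica']
```

Both approaches return the same answer. The difference is that the cost of a hop in the first depends only on the node's own neighbor list, while in the second it depends on the size of the shared table or its index.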
91. Neo4j Query Planner
Cost based Query Planner since Neo4j
• Uses transactional database statistics
• High performance Query Engine
• Bytecode compiled queries
• Future: Parallelism
92. Architecture Components
1. Index-Free Adjacency: In memory and on flash/disk
2. ACID Foundation: Required for safe writes
3. Full-Stack Clustering: Causal consistency, Security
4. Language, Drivers, Tooling: Developer Experience, Graph Efficiency
5. Graph Engine: Cost-Based Optimizer, Graph Statistics, Cypher Runtime
6. Hardware Optimizations: For next-gen infrastructure
93. Neo4j – allows you to connect the dots
• Was built to efficiently
• store,
• query and
• manage highly connected data
• Transactional, ACID
• Real-time OLTP
• Open source
• Highly scalable on few machines
94. High Query Performance: Some Numbers
• Traverse 2-4M+ relationships per second per core
• Cost-based query optimizer: complex queries return in milliseconds
• Import 100K-1M records per second transactionally
• Bulk import tens of billions of records in a few hours
97. How do I get it? Desktop – Container – Cloud
http://neo4j.com/download/
docker run neo4j
98. Neo4j Cluster Deployment Options
• Developer: Neo4j Desktop (free Enterprise License)
• On premise – Standalone or via OS package
• Containerized with official Docker Image
• In the Cloud
• AWS, GCE, Azure
• Using Resource Managers
• DC/OS – Marathon
• Kubernetes
• Docker Swarm
99. Downloads: 10M+ (3M+ from Neo4j Distribution, 7M+ from Docker)
Events: 400+ (approximate number of Neo4j events per year)
Active Community: 50k+ Meetups (number of Meetup members globally)
Trained Developers: 50k+ (trained/certified Neo4j professionals)
100. Summary: Graphs allow you to ...
• Keep your rich data model
• Handle relationships efficiently
• Write queries easily
• Develop applications quickly
• Have fun
104. Causal Clustering - Features
• Two Zones – Core + Edge
• Group of Core Servers – Consistent and Partition tolerant (CP)
• Transactional Writes
• Quorum Writes, Cluster Membership, Leader via Raft Consensus
• Scale out with Read Replicas
• Smart Bolt Drivers with
• Routing, Read & Write Sessions
• Causal Consistency with Bookmarks
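The quorum-write rule in the list above (a write counts as committed once a majority of Core servers have recorded it) can be sketched as follows. This is a deliberately simplified model and not an implementation of Raft: there is no leader election, no terms, and no log repair. The class and function names are hypothetical.

```python
class CoreMember:
    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.log = []

    def append(self, tx):
        """Record the transaction and acknowledge it; unavailable members do neither."""
        if not self.available:
            return False
        self.log.append(tx)
        return True

def quorum_write(core, tx):
    """Report success only if a strict majority of core members acknowledged tx."""
    acks = sum(1 for member in core if member.append(tx))
    majority = len(core) // 2 + 1
    # Simplification: in real Raft, entries that miss the majority stay
    # uncommitted and may later be overwritten; that is not modeled here.
    return acks >= majority

core = [CoreMember("core-1"), CoreMember("core-2"), CoreMember("core-3", available=False)]
print(quorum_write(core, "tx1"))  # True: 2 of 3 acknowledged

core[1].available = False
print(quorum_write(core, "tx2"))  # False: only 1 of 3 acknowledged
```

With three Core servers the cluster tolerates one failure, and with five it tolerates two, which is why a small odd-sized Core group is a common choice.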
105. Core
• Small group of Neo4j databases
• Fault-tolerant Consensus Commit
• Responsible for data safety
Replica
• For massive query throughput
• Read-only replicas
• Not involved in Consensus Commit
106. Writing to the Core Cluster
Neo4j
Driver
✓
✓
✓
Success
Neo4j
Cluster
109. Bookmark
• Session token
• String (for portability)
• Opaque to application
• Represents ultimate user’s most
recent view of the graph
• More capabilities to come
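How a bookmark lets a client read its own writes from a lagging replica can be sketched like this. The model is hypothetical: the `bookmark:<txid>` string format, the class, and the function names are invented for illustration, and real bookmarks should be treated as opaque by applications. The sketch also raises an error where a real server would typically wait until it has caught up.

```python
class Server:
    def __init__(self):
        self.last_applied = 0   # id of the newest transaction applied here
        self.data = {}

    def apply(self, tx_id, key, value):
        self.data[key] = value
        self.last_applied = tx_id

def write(core, tx_id, key, value):
    """Apply a write on the core and hand back a bookmark string for it."""
    core.apply(tx_id, key, value)
    return f"bookmark:{tx_id}"  # hypothetical format; opaque to the application

def read(replica, key, bookmark=None):
    """Serve the read only if the replica has applied everything up to the bookmark."""
    required = int(bookmark.split(":")[1]) if bookmark else 0
    if replica.last_applied < required:
        # Simplification: raise instead of waiting for the replica to catch up.
        raise RuntimeError("replica has not caught up to the bookmark yet")
    return replica.data.get(key)

core, replica = Server(), Server()
bookmark = write(core, 1, "name", "Dan")

print(read(replica, "name"))            # None: stale read without a bookmark
replica.apply(1, "name", "Dan")         # replication catches up
print(read(replica, "name", bookmark))  # Dan
```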
111. Neo4j 3.0 vs. Neo4j 3.1
Neo4j 3.0: High Availability Cluster
• Master-Slave architecture
• Paxos consensus used for master election
• Transaction committed once written durably on the master
• Practical deployments: 10s of servers
Neo4j 3.1: Causal Cluster
• Two-part cluster: writeable Core and read-only read replicas
• Raft protocol used for leader election, membership changes and commitment of all transactions
• Transaction committed once written durably on a majority of the core members
• Practical deployments: 100s of servers
114. Writing to the Core Cluster – Raft Consensus
Commits
Neo4j
Driver
✓
✓
✓
Success
Neo4j
Cluster
127. Case study: Solving real-time recommendations for the world’s largest retailer
Challenge
• In its drive to provide the best web experience for its customers, Walmart wanted to optimize its online recommendations.
• Walmart recognized the challenge it faced in delivering recommendations with traditional relational database technology.
Use of Neo4j
• Walmart uses Neo4j to quickly query customers’ past purchases, as well as instantly capture any new interests shown in the customers’ current online visit, which is essential for making real-time recommendations.
Result/Outcome
• With Neo4j, Walmart could substitute a heavy batch process with a simple and real-time graph database.
“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.”
- Marcos Wada, Walmart
128. Case study: eBay Now Tackles eCommerce Delivery Service Routing with Neo4j
Challenge
• The queries used to select the best courier for eBay’s routing system were simply taking too long, and eBay needed a solution to maintain a competitive service.
• The MySQL joins being used created a code base too slow and complex to maintain.
Use of Neo4j
• eBay is now using Neo4j’s graph database platform to redefine e-commerce, by making delivery of online and mobile orders quick and convenient.
Result/Outcome
• With Neo4j, eBay managed to eliminate the biggest roadblock between retailers and online shoppers: the option to have your item delivered the same day.
• The schema-flexible nature of the database allowed easy extensibility, speeding up development.
• The Neo4j solution was more than 1000x faster than the prior MySQL solution.
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code.”
– Volker Pacher, eBay
129. Top Tier US Retailer
Case study: Solving real-time promotions for a top US retailer
Challenge
• Suffered significant revenue loss due to legacy infrastructure.
• Particularly challenging when handling transaction volumes on peak shopping occasions such as Thanksgiving and Cyber Monday.
Use of Neo4j
• Neo4j is used to revolutionize and reinvent its real-time promotions engine.
• On average, Neo4j processes 90% of this retailer’s 35M+ daily transactions, each 3-22 hops, in 4ms or less.
Result/Outcome
• Reached an all-time high in online revenues, due to the Neo4j-based friction-free solution.
• Neo4j also enabled the company to be one of the first retailers to provide the same promotions across both online and traditional retail channels.
“On an average Neo4j processes 90% of this retailer’s 35M+ daily transactions, each 3-22 hops, in 4ms or less.”
– Top Tier US Retailer
130. Relational DBs Can’t Handle Relationships Well
• Cannot model or store data and relationships without complexity
• Performance degrades with number and levels of relationships, and database size
• Query complexity grows with need for JOINs
• Adding new types of data and relationships requires schema redesign, increasing time to market
… making traditional databases inappropriate when data relationships are valuable in real-time
Slow development
Poor performance
Low scalability
Hard to maintain
131. Unlocking Value from Your Data Relationships
• Model your data as a graph of data and relationships
• Use relationship information in real-time to transform your business
• Add new relationships on the fly to adapt to your changing business