© 2022 Neo4j, Inc. All rights reserved.
Fighting a Multi-armed Monster With Graph: Master Data Management in Neo4j
Steven Scott, Cognitive Software Engineer at Northrop Grumman
Travis Confer, Software Engineer at Northrop Grumman
Approved for Public Release: NG22-0878 © 2022, Northrop Grumman
The Problem Emerges
Goal: Solve complex engineering and business problems
• A single problem may require data from multiple systems
• Systems do not generally interface nicely
• Concepts often span multiple data stores
“I need data from both system A and system B to do X, so I’ll just shuttle some data from A into B.”
– Common thought pattern
Previous Hodge-Podge of Data Management Approaches
• Manually duplicate data from one system to another 💀💀💀
• Manual export + import of data between systems 💀💀
• Ad hoc scripts to shuttle data 💀
• Automated script runner
Data Shuttling Problem and Pain Points
Data duplication
Synchronization issues
Which version to trust?
Time consuming to create pair-wise connections
Non-uniform data review
Graph-First Approach
Ontology Driven Design
• Graph-based approach to data modeling
• Declarative
• To learn more, see “Accelerating ML Ops with Graphs and Ontology-Driven Design”
GRAND Stack
• GraphQL
• React
• Apollo
• Neo4j Database
Kraken - Domain Structure
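As a rough illustration of why a central hub scales better than pair-wise connections, the sketch below counts the connections needed in each arrangement. It assumes one connection per pair of systems in the point-to-point case and one connection per system in the hub case; the function names are illustrative and not part of Kraken.

```python
def pairwise_connections(n_systems: int) -> int:
    """Connections needed if every pair of systems is linked directly."""
    return n_systems * (n_systems - 1) // 2


def hub_connections(n_systems: int) -> int:
    """Connections needed if every system links only to a central hub."""
    return n_systems


for n in (3, 5, 10, 20):
    print(f"{n:>2} systems: {pairwise_connections(n):>3} pair-wise, "
          f"{hub_connections(n):>2} via hub")
```

With 10 systems, the pair-wise arrangement needs 45 connections while the hub arrangement needs 10, and the gap widens as more systems are added.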
Publisher – Subscriber Model
[Diagram: a Subscriber node connected to a Publisher node by a “Subscribes To” relationship]
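The publisher – subscriber idea can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the Kraken implementation: the `DataItem` class, its methods, and the item name `jira.task.due_date` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A callback receives (item name, old value, new value).
Callback = Callable[[str, object, object], None]


@dataclass
class DataItem:
    """A published value that other items can listen to (hypothetical)."""
    name: str
    value: object = None
    subscribers: List[Callback] = field(default_factory=list, repr=False)

    def subscribe(self, callback: Callback) -> None:
        """Register a callback to be notified when the value changes."""
        self.subscribers.append(callback)

    def update(self, new_value: object) -> None:
        """Store the new value and notify subscribers only on a real change."""
        old_value, self.value = self.value, new_value
        if old_value != new_value:
            for notify in self.subscribers:
                notify(self.name, old_value, new_value)


notifications = []
source = DataItem("jira.task.due_date", "2022-06-01")
source.subscribe(lambda name, old, new: notifications.append(f"{name}: {old} -> {new}"))
source.update("2022-06-15")
source.update("2022-06-15")  # unchanged value, so no second notification
print(notifications)
```

Notifying only on an actual change keeps subscribers from being flooded when a publisher rewrites the same value.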
Data Structure
Needed Functionality
• Determine whether data is out-of-sync
• Audit history of changes
• Review and certify data
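One way to picture the out-of-sync and certification checks is with a version counter on the published item. The sketch below is an assumption about how such checks could work, not a description of Kraken; the classes `Published` and `Subscription` and the values "rev A" and "rev B" are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Published:
    """A published data item with a version that increments on every update."""
    value: str
    version: int = 1

    def update(self, value: str) -> None:
        self.value = value
        self.version += 1


@dataclass
class Subscription:
    """Tracks which publisher version a subscriber last synced and certified."""
    source: Published
    synced_version: int
    certified_version: Optional[int] = None

    def is_out_of_sync(self) -> bool:
        return self.synced_version != self.source.version

    def certify(self) -> None:
        """Record that a reviewer approved the publisher's current version."""
        self.certified_version = self.source.version

    def certification_is_current(self) -> bool:
        return self.certified_version == self.source.version

    def sync(self) -> str:
        """Pull the latest value and mark the subscription as in sync."""
        self.synced_version = self.source.version
        return self.source.value


part = Published("rev A")
sub = Subscription(source=part, synced_version=part.version)
sub.certify()
part.update("rev B")
print(sub.is_out_of_sync())            # True: the publisher moved ahead
print(sub.certification_is_current())  # False: certification needs re-review
sub.sync()
print(sub.is_out_of_sync())            # False after syncing
```

Syncing does not restore the certification, which matches the idea that updated data must be reviewed again before it is trusted for a use case.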
Two conflated definitions for the Authoritative Source of Truth (ASOT)
ASOT: Authorized for a Use Case
A human in authority reviews a particular piece of data, determines that it is accurate, and gives it a stamp of approval.
ASOT: Authored in some system
An application generates the pieces of data. Since that is the original source of the data, it is considered authoritative.
Tracking Digital Threads
Tracing data from cradle to grave
Needed to maintain an Authoritative Source
of Truth (ASOT)
[Figure: a digital thread running from Start through Update to Current State, with changes attributed to Jane and Joe]
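A digital thread for a single data item can be thought of as an append-only history of changes, each attributed to a person. The sketch below is illustrative only: `Change`, `TrackedItem`, the item name, and the status values are hypothetical, while Jane and Joe are borrowed from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Change:
    """One immutable entry in an item's history."""
    value: str
    author: str
    timestamp: datetime


@dataclass
class TrackedItem:
    """A data item whose changes are only ever appended, never overwritten."""
    name: str
    history: List[Change] = field(default_factory=list)

    def record(self, value: str, author: str) -> None:
        self.history.append(Change(value, author, datetime.now(timezone.utc)))

    @property
    def current(self) -> Change:
        """The most recent change (requires at least one recorded change)."""
        return self.history[-1]

    def lineage(self) -> List[str]:
        """Readable trail from the first recorded value to the current one."""
        return [f"{c.value} (by {c.author})" for c in self.history]


item = TrackedItem("activity.status")
item.record("Not started", "Jane")
item.record("In progress", "Joe")
print(item.lineage())
print(item.current.author)
```

Because entries are never modified, the full trail from the first value to the current state remains available for audit.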
Kraken Demo
Why Graphs are Useful for Master Data Management
• Dependencies are transparent
• Analytics can be done in the graph
• Apparent where data is authored
• Apparent what data has been authorized
Drawbacks of this Approach
• Single-value nodes not ideal in Neo4j
  • Results in considerable data storage scale-up
• Some data types are problematic
  • Image data
  • Blobs/binary files
Benefits of this Approach
• Transparent system dependencies
• Remove unnecessary data shuttling
• Subscribe to true source
When is this Approach Most Appropriate?
• Data is being copied into many systems
• Requires data tracking at a granular level
• System interconnectivity is high
Thank you!


Editor's Notes

  1. (Introduce ourselves, then Northrop Grumman) Northrop Grumman is a Fortune 500 aerospace and defense technology company with over 90,000 employees. You may not have heard of us since our products are generally sold to governments, not individuals. Within Northrop, we are part of the Digital Transformation Organization, which focuses on improving business and engineering processes through modern software systems and data science practices. As you might have guessed by our presence at this conference, we often use graphs to accomplish this.
  2. Solving complex engineering and business problems generally requires varied data spread across multiple applications. Many projects at Northrop Grumman are initially complex, with unnecessary complexity added because data is spread across so many systems. As we began looking at this problem last summer, we found that a common thought pattern was to just shuttle data from one application to another. As we’ll see in the subsequent slides, this leads to many problems.
  3. We realized that people had found various fragmented approaches to dealing with this data management challenge. This desire to simply shuttle data between systems without a unified approach leads to sub-optimal solutions. While manually entering data from one system into another isn’t commonly used, we did find instances where this was the data management approach. Obviously, this is labor-intensive, error-prone, and generally just a bad idea. Manually exporting the data from one system and importing it into another is a slight improvement over manual entry. However, it still requires a human to export/import data between systems, and many systems do not allow data from another system to be cleanly imported. To remove the human-in-the-loop aspect of the approaches above, people will often write some sort of script to automatically move data from one system to another. While this does improve on manual management, it requires an additional shuttling script for each system that … You’ve probably watched people work their way up through this sequence of approaches if your company is anything like ours. While exposing data via APIs is certainly preferable, we found that it wasn’t always possible at Northrop to just provide API access to a set of data. Sometimes, due to functionality in a particular app, a contract requirement, or another constraint, data based on multiple systems MUST end up in a destination system.
  4. If all you’ve done is lift and shift your data, you’ve missed the point.
     • Shuttling data between applications causes data duplication. This can cause confusion as to which duplicate is to be trusted if they don’t match.
     • Shuttling data between applications causes synchronization issues.
     • It is time-consuming to create connections between each pair of applications.
     • If data is certified for a use case and is then updated, the certification must be re-reviewed.
     • Legacy applications lack interfaces with modern applications and vice versa.
  5. Solutions often structurally reflect the problem they are designed to solve.
     • Some claim the structure of the CIA reflects the structure of the Soviet Union (brain/leader at top, siloed organizations sending data up to the top and receiving decisions).
     • Kraken has many arms connected to a central hub, which reflects the many-armed nature of the problem.
     Prototype proposal: as this problem seemed very graph-y, we submitted a proposal to build a prototype graph-based master data management solution. We received funding and built the prototype.
     Mention briefly, point people to the ODD talk.
     TODO: put GRAND stack logos
  6. Each system becomes a “tentacle” that plugs into the central hub. Instead of creating connections between each pair of systems, each system just needs to connect into Kraken. Once a tentacle is connected to Kraken, the data in that tentacle can subscribe to data in any other tentacle. Conversely, each other tentacle can subscribe to the data in the newly connected tentacle. We divided the problem up into various domains, each of which represents a small part of the overall solution. For each tentacle, we identified the need for a source domain, capturing information about where the data initially comes from. Then, we layered on structure based on how the application structures itself. Then, we layered on semantics, which add more knowledge about how things relate to each other and what they mean to the user.
  7. When you do need data to be synchronized between systems, creating a publisher – subscriber structure is an easy way to allow one item to “listen” to another and to notify the user when a change occurs.
  8. Our data management application would ideally provide several key functionalities:
     • Allow the user to create subscriptions (publisher – subscriber model) between specific data items
     • Provide an auditable history of changes for key data items
     • Allow data to be certified for a use case
     • Provide an auditable history of certifications for data items
     • Notify the user when certified data has been modified
  9. While working on this problem, we discovered that, depending on who we talked to, they would talk about ASOTs in one of these two ways.
  10. Digital threads are a hot topic in the defense industry. A digital thread allows tracking data lineage and auditing changes over time. The publisher – subscriber model and the history of changes for an individual atom create, from Kraken’s point of view, a digital thread.
  11. Fabricated data in the same structure as Tasks (Activities) in Jira and Activities in Primavera
  12. Made using fabricated data. Jira is an Atlassian product to track progress on tasking, among other things. Primavera is an Oracle product to manage project portfolios.
  13. Move subscriptions: A->B->C, then A->C is allowed
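
One possible reading of the last note is that a subscription may be re-pointed to any publisher further up its own chain: if A subscribes to B and B subscribes to C, then A may subscribe directly to C. The sketch below encodes that reading under the assumption that each item subscribes to at most one publisher; the function names and the dictionary representation are hypothetical and are not taken from Kraken.

```python
from typing import Dict, Iterator


def upstream_chain(subscriptions: Dict[str, str], item: str) -> Iterator[str]:
    """Yield each publisher reached by following subscriptions from item."""
    seen = set()
    current = subscriptions.get(item)
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        yield current
        current = subscriptions.get(current)


def move_subscription(subscriptions: Dict[str, str],
                      subscriber: str, new_publisher: str) -> bool:
    """Re-point subscriber to new_publisher if it is already upstream."""
    if new_publisher in upstream_chain(subscriptions, subscriber):
        subscriptions[subscriber] = new_publisher
        return True
    return False


# Keys subscribe to values: A -> B -> C.
subs = {"A": "B", "B": "C"}
print(move_subscription(subs, "A", "C"))  # True: C is upstream of A
print(subs)                               # {'A': 'C', 'B': 'C'}
print(move_subscription(subs, "B", "A"))  # False: A is not upstream of B
```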