SlideShare a Scribd company logo
1 of 19
Download to read offline
A journey in developing an
open-source application on
Neo4j
GraphSummit Copenhagen, 7th March 2024
Mikkel Traun, Principal System Developer, Novo Nordisk A/S
Henrik Enquist, PhD, Lead Software Developer, Novo Nordisk A/S
Novo Nordisk®
What is the OpenStudyBuilder?…
A NEW APPROACH TO STUDY
SPECIFICATION
• Compliance with external and internal standards
• Facilitates automation and content reuse
• Ensures a higher degree of end-to-end consistency
2
3 ELEMENTS OF OpenStudyBuilder
• Clinical Metadata Repository (clinical MDR)
(central repository for all study specification data)
• OpenStudyBuilder application / Web UI
• API layer
(allowing interoperability with other applications)
(DDF API Adaptor – enabling DDF SDR Compatibility)
clinical MDR
Novo Nordisk®
OpenStudyBuilder Components
3
STUDIES
TITLE CRITERIA
REGISTRY IDENTIFERS INTERVENTIONS
STRUCTURE PURPOSE
POPULATION ACTVITIES
LIBRARY
CONTROLLED
TERMINOLOGY
MEDICAL DICTIONARIES
(e.g., MedDRA)
CONCEPTS (ACTIVITIES,
UNITS, CRFs, COMPOUNDS)
SYNTAX TEMPLATES
DATA EXCHANGE STANDARDS
Novo Nordisk®
Applying Domain Driven Design principles
• Domain Driven Design (DDD) is a common
design pattern in system development
• Works well with the linked graph database
design
• Ending up with a very big data model – as
the domain is complex – maybe OK
• Can be complicated in the Python based
API component
• Challenges in using Neo4j OGM NeoModel
with DDD principles in Python
4
Novo Nordisk®
Benefits working with linked graph model
• Big and complex data domain is captured well in
the label property graph model
• Given you understand the
clinical data domain!
• Support fine granular versioning
• Support domain driven queries
• Can deliver fast performance
• Other MDR solutions have applied generic
relational data models
• Having difficulties in managing versioning at a
granular level
• Very complex and long running queries
5
Novo Nordisk®
Technical details| MDR and StudyBuilder
Vue.js: https://vuejs.org/ Vuetify: https://vuetifyjs.com/en/ VuePress: https://vuepress.vuejs.org/ FastAPI: https://fastapi.tiangolo.com/ Neo4j: https://neo4j.com/neo4j-graph-database/
Front-end
• Vue.js
• Modern JavaScript web framework
• Vuetify user interface styling library
• Websites can be displayed on all operating
systems
• Views automatically adjust to underlying
data
• Pages are created with re-usable
components, e.g. a table or a visualisation.
Code once, reuse many times.
• Wide usage:
• Popular among developers
• Google, Apple, Netflix
• VuePress Documentation portal
Service layer
• Python driven API
• FastAPI Framework
• Highly readable code
• More than 40% of developers use Python,
according to Stack Overflow
• Restful API
• CDISC use python for their API as well
• Automatically generate API
documentation with inline code comments
• Lightweight
• Cloud hosting (Azure) provides elastic
scaling – upscale and downscale
according to immediate usage
Backend
• Neo4j database
• Modern label-property graph database
• Data model is close to the domain
• Hosted on any cloud or locally
Novo Nordisk®
Challenges using neomodel versus plain Cypher
- neomodel can be fast
https://github.com/neo4j-contrib/neomodel
Novo Nordisk®
Challenges using neomodel versus plain Cypher
- neomodel can be slow
• Fetching many items with several properties 🡪 many queries, latencies add up
• A single “big” cypher query is much faster but is more work to write and maintain
• neomodel can do better
items = root_class.nodes.order_by(”-name")
for item in items:
obj_a = item.has_obj_a.get_or_none()
obj_b = item.has_obj_b.get_or_none()
obj_c = item.has_obj_c.get_or_none()
obj_d = item.has_obj_d.get_or_none()
items = root_class.nodes.all().fetch_relationships("obj_a", "obj_b", ...)
Novo Nordisk®
Challenges using neomodel versus plain Cypher
9
Make simple implementation with neomodel
If performance is satisfactory, done
If not, refactor to utilize neomodel better
• Or, if not possible, reimplement with Cypher
Keep track of performance as data amount increases
Novo Nordisk®
Challenges using neomodel versus plain Cypher
Dummy data:
• Simple
• Small
• Uses a subset of the data model
Production data:
• Complicated
• Big
• Uses (nearly) the full data model
• Need better dummy data!
• Generative AI can help to create richer dummy data
10
Novo Nordisk®
Challenges showing graph data in tables
• Vuetify Data table
11
Novo Nordisk®
Challenges showing graph data in tables
• StudyBuilder
list of Activities
12
Novo Nordisk®
Changing data models and data migrations
• Model continuously evolves
• Example: Adding an intermediate node between two existing
• Migration needed!
13
Novo Nordisk®
Changing data models and data migrations
We need:
- Migration script (usually Cypher)
- Verification script (usually Cypher)
- Test data, preferably a sample extracted from production.
Test:
- Inject test data in a temporary DB
- Run migration
- Run verification
- Run migration again, assert that nothing changed
Challenge:
- Documenting the changes
14
Novo Nordisk®
Changing data models and data migrations
- Change Data Capture
15
@capture_changes() decorator
1) Extract docstring from wrapped function
2) Enable log enrichment
3) Query for the current change id
4) Run the wrapped function
5) Query for the changes
6) Dump the changes to a json file
7) Generate a summary and append to a
markdown file
Novo Nordisk®
Changing data models and data migrations
- Change Data Capture
16
Novo Nordisk®
Neo4j Enterprise and open-source sharing
17
- Enterprise features are very useful.
- Who are the users of your open-source application?
- Mainly other pharma's – they need full support so not an issue
- Non-profit organizations – they can get a free license so not an issue
- Smaller biotech's and CRO’s – they can have an issue with costs!
- Can the functionality using Enterprise features be optional?
- Will require a dedicated deployment, data level access control, will
impact system validation and potential performance
Novo Nordisk®
Summary
• Neo4j database as store for enterprise application
• Big and complex data domain is captured well in the label property graph model
• Graph data model allows data to grow without getting more complicated
• Some challenges in how to present data to users
18
Thanks!
Questions?

More Related Content

Similar to Novo Nordisk's journey in developing an open-source application on Neo4j

Make your Microservices sing!
Make your Microservices sing!Make your Microservices sing!
Make your Microservices sing!Craig Barr
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...David Wallom
 
Federated Cloud Computing
Federated Cloud ComputingFederated Cloud Computing
Federated Cloud ComputingDavid Wallom
 
Open Chemistry: Input Preparation, Data Visualization & Analysis
Open Chemistry: Input Preparation, Data Visualization & AnalysisOpen Chemistry: Input Preparation, Data Visualization & Analysis
Open Chemistry: Input Preparation, Data Visualization & AnalysisMarcus Hanwell
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSSteve Wong
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoopgluent.
 
Current & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightCurrent & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightabhijit2511
 
Elasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingElasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingCascading
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresqlbotsplash.com
 
Student Industrial Training Presentation Slide
Student Industrial Training Presentation SlideStudent Industrial Training Presentation Slide
Student Industrial Training Presentation SlideKhairul Filhan
 
SCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation EnvironmentsSCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation EnvironmentsSCAPE Project
 
Automating the process of continuously prioritising data, updating and deploy...
Automating the process of continuously prioritising data, updating and deploy...Automating the process of continuously prioritising data, updating and deploy...
Automating the process of continuously prioritising data, updating and deploy...Ola Spjuth
 
MIGRATION - PAIN OR GAIN?
MIGRATION - PAIN OR GAIN?MIGRATION - PAIN OR GAIN?
MIGRATION - PAIN OR GAIN?DrupalCamp Kyiv
 
Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB
 Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB
Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDBMongoDB
 
Postgres NoSQL - Delivering Apps Faster
Postgres NoSQL - Delivering Apps FasterPostgres NoSQL - Delivering Apps Faster
Postgres NoSQL - Delivering Apps FasterEDB
 
GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?Neo4j
 
Optymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroli
Optymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroliOptymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroli
Optymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroliEDB
 

Similar to Novo Nordisk's journey in developing an open-source application on Neo4j (20)

Make your Microservices sing!
Make your Microservices sing!Make your Microservices sing!
Make your Microservices sing!
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...
 
Federated Cloud Computing
Federated Cloud ComputingFederated Cloud Computing
Federated Cloud Computing
 
Open Chemistry: Input Preparation, Data Visualization & Analysis
Open Chemistry: Input Preparation, Data Visualization & AnalysisOpen Chemistry: Input Preparation, Data Visualization & Analysis
Open Chemistry: Input Preparation, Data Visualization & Analysis
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OS
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoop
 
Current & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightCurrent & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylight
 
Elasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingElasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log Processing
 
Oracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_databaseOracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_database
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
Student Industrial Training Presentation Slide
Student Industrial Training Presentation SlideStudent Industrial Training Presentation Slide
Student Industrial Training Presentation Slide
 
SCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation EnvironmentsSCAPE - Scalable Preservation Environments
SCAPE - Scalable Preservation Environments
 
Automating the process of continuously prioritising data, updating and deploy...
Automating the process of continuously prioritising data, updating and deploy...Automating the process of continuously prioritising data, updating and deploy...
Automating the process of continuously prioritising data, updating and deploy...
 
MIGRATION - PAIN OR GAIN?
MIGRATION - PAIN OR GAIN?MIGRATION - PAIN OR GAIN?
MIGRATION - PAIN OR GAIN?
 
Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB
 Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB
Webinar: “ditch Oracle NOW”: Best Practices for Migrating to MongoDB
 
Postgres NoSQL - Delivering Apps Faster
Postgres NoSQL - Delivering Apps FasterPostgres NoSQL - Delivering Apps Faster
Postgres NoSQL - Delivering Apps Faster
 
GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?
 
Optymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroli
Optymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroliOptymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroli
Optymalizacja środowiska Open Source w celu zwiększenia oszczędności i kontroli
 
OGCE SC10
OGCE SC10OGCE SC10
OGCE SC10
 

More from Neo4j

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansQIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansNeo4j
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...Neo4j
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosNeo4j
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Neo4j
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...Neo4j
 

More from Neo4j (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansQIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with Graph
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
 

Recently uploaded

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Recently uploaded (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 

Novo Nordisk's journey in developing an open-source application on Neo4j

  • 1. A journey in developing an open-source application on Neo4j GraphSummit Copenhagen, 7th March 2024 Mikkel Traun, Principal System Developer, Novo Nordisk A/S Henrik Enquist, PhD, Lead Software Developer, Novo Nordisk A/S
  • 2. Novo Nordisk® What is the OpenStudyBuilder?… A NEW APPROACH TO STUDY SPECIFICATION • Compliance with external and internal standards • Facilitates automation and content reuse • Ensures a higher degree of end-to-end consistency 2 3 ELEMENTS OF OpenStudyBuilder • Clinical Metadata Repository (clinical MDR) (central repository for all study specification data) • OpenStudyBuilder application / Web UI • API layer (allowing interoperability with other applications) (DDF API Adaptor – enabling DDF SDR Compatibility) clinical MDR
  • 3. Novo Nordisk® OpenStudyBuilder Components 3 STUDIES TITLE CRITERIA REGISTRY IDENTIFERS INTERVENTIONS STRUCTURE PURPOSE POPULATION ACTVITIES LIBRARY CONTROLLED TERMINOLOGY MEDICAL DICTIONARIES (e.g., MedDRA) CONCEPTS (ACTIVITIES, UNITS, CRFs, COMPOUNDS) SYNTAX TEMPLATES DATA EXCHANGE STANDARDS
  • 4. Novo Nordisk® Applying Domain Driven Design principles • Domain Driven Design (DDD) is a common design pattern in system development • Works well with the linked graph database design • Ending up with a very big data model – as the domain is complex – maybe OK • Can be complicated in the Python based API component • Challenges in using Neo4j OGM NeoModel with DDD principles in Python 4
  • 5. Novo Nordisk® Benefits working with linked graph model • Big and complex data domain is captured well in the label property graph model • Given you understand the clinical data domain! • Support fine granular versioning • Support domain driven queries • Can deliver fast performance • Other MDR solutions have applied generic relational data models • Having difficulties in managing versioning at a granular level • Very complex and long running queries 5
  • 6. Novo Nordisk® Technical details| MDR and StudyBuilder Vue.js: https://vuejs.org/ Vuetify: https://vuetifyjs.com/en/ VuePress: https://vuepress.vuejs.org/ FastAPI: https://fastapi.tiangolo.com/ Neo4j: https://neo4j.com/neo4j-graph-database/ Front-end • Vue.js • Modern JavaScript web framework • Vuetify user interface styling library • Websites can be displayed on all operating systems • Views automatically adjust to underlying data • Pages are created with re-usable components, e.g. a table or a visualisation. Code once, reuse many times. • Wide usage: • Popular among developers • Google, Apple, Netflix • VuePress Documentation portal Service layer • Python driven API • FastAPI Framework • Highly readable code • More than 40% of developers use Python, according to Stack Overflow • Restful API • CDISC use python for their API as well • Automatically generate API documentation with inline code comments • Lightweight • Cloud hosting (Azure) provides elastic scaling – upscale and downscale according to immediate usage Backend • Neo4j database • Modern label-property graph database • Data model is close to the domain • Hosted on any cloud or locally
  • 7. Novo Nordisk® Challenges using neomodel versus plain Cypher - neomodel can be fast https://github.com/neo4j-contrib/neomodel
  • 8. Novo Nordisk® Challenges using neomodel versus plain Cypher - neomodel can be slow • Fetching many items with several properties 🡪 many queries, latencies add up • A single “big” cypher query is much faster but is more work to write and maintain • neomodel can do better items = root_class.nodes.order_by(”-name") for item in items: obj_a = item.has_obj_a.get_or_none() obj_b = item.has_obj_b.get_or_none() obj_c = item.has_obj_c.get_or_none() obj_d = item.has_obj_d.get_or_none() items = root_class.nodes.all().fetch_relationships("obj_a", "obj_b", ...)
  • 9. Novo Nordisk® Challenges using neomodel versus plain Cypher 9 Make simple implementation with neomodel If performance is satisfactory, done If not, refactor to utilize neomodel better • Or, if not possible, reimplement with Cypher Keep track of performance as data amount increases
  • 10. Novo Nordisk® Challenges using neomodel versus plain Cypher Dummy data: • Simple • Small • Uses a subset of the data model Production data: • Complicated • Big • Uses (nearly) the full data model • Need better dummy data! • Generative AI can help to create richer dummy data 10
  • 11. Novo Nordisk® Challenges showing graph data in tables • Vuetify Data table 11
  • 12. Novo Nordisk® Challenges showing graph data in tables • StudyBuilder list of Activities 12
  • 13. Novo Nordisk® Changing data models and data migrations • Model continuously evolves • Example: Adding an intermediate node between two existing • Migration needed! 13
  • 14. Novo Nordisk® Changing data models and data migrations We need: - Migration script (usually Cypher) - Verification script (usually Cypher) - Test data, preferably a sample extracted from production. Test: - Inject test data in a temporary DB - Run migration - Run verification - Run migration again, assert that nothing changed Challenge: - Documenting the changes 14
  • 15. Novo Nordisk® Changing data models and data migrations - Change Data Capture 15 @capture_changes() decorator 1) Extract docstring from wrapped function 2) Enable log enrichment 3) Query for the current change id 4) Run the wrapped function 5) Query for the changes 6) Dump the changes to a json file 7) Generate a summary and append to a markdown file
  • 16. Novo Nordisk® Changing data models and data migrations - Change Data Capture 16
  • 17. Novo Nordisk® Neo4j Enterprise and open-source sharing 17 - Enterprise features are very useful. - Who are the users of your open-source application? - Mainly other pharma's – they need full support so not an issue - Non-profit organizations – they can get a free license so not an issue - Smaller biotech's and CRO’s – they can have an issue with costs! - Can the functionality using Enterprise features be optional? - Will require a dedicated deployment, data level access control, will impact system validation and potential performance
  • 18. Novo Nordisk® Summary • Neo4j database as store for enterprise application • Big and complex data domain is captured well in the label property graph model • Graph data model allows data to grow without getting more complicated • Some challenges in how to present data to users 18