A Fusion of Machine Learning and Graph Analysis for Free-Form Data Entry Clustering
1.
© 2022 Neo4j, Inc. All rights reserved.
A Fusion of Machine Learning and Graph Analysis for Free-Form Data Entry Clustering
Dr. Andrew Flinders, Joel Linford
Data Scientists, Northrop Grumman Corporation – Space Sector
2.
• Repair Narratives
• Building Clusters
• BERT Embeddings [1]
• Constellations
3.
Motivation
Problem
• Maintenance records
• Need to identify patterns and structures present in free-form text
• Finding general topics can be challenging
Hypothesis
• We hypothesized that combining large language models (deep learning), clustering techniques (shallow learning), and graph databases (graph algorithms) could map and retain these patterns.
4.
Narratives: free-form text with vital info
• 10 REPLACED ALL PISTONS
• 11 CLEANED HUBCAPS
• 12 COMPLETED DRIVE SHAFT CO
• 13 REPLACED WATER PUMP
• 14 SCHEDULED MAINTENANCE
• 15 CLEANED INJECTORS
• 16 CLEANED FLOOR MATS
• 17 PATCHED WIRING IN CAB
• 18 LABELED SEATING ASSIGNMENTS
• 19 NO FOUND ON SPARK PLUGS
This technique will work for any free-form text where there is reason to believe that patterns or trends exist. The entries above are a few examples of the text we are working with.
5.
BERT Embeddings [1]
BERT is a language model that embeds text into semantically sensitive vectors (as opposed to a bag-of-words model, which is mostly semantically insensitive). These vectors are extremely effective at making text usable for machine learning. BERT is a deep neural network, which brings deep learning and transfer learning into play.
How BERT was trained (and why it did not use Twitter)
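To make the bag-of-words contrast concrete, here is a minimal sketch in plain Python. It is not part of the original talk: the helper names `bag_of_words` and `cosine` are hypothetical, and the narratives "PUMP DAMAGED HOSE", "HOSE DAMAGED PUMP", and "SWAPPED COOLANT IMPELLER" are invented for illustration. It shows two ways a word-count representation ignores meaning: reordering the words leaves the vector unchanged, and a paraphrase with no shared words scores zero similarity.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens; word order is discarded."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Same words, opposite meaning: the two count vectors are identical.
reordered = cosine(bag_of_words("PUMP DAMAGED HOSE"),
                   bag_of_words("HOSE DAMAGED PUMP"))

# Related meaning, no shared words: the count vectors do not overlap at all.
paraphrase = cosine(bag_of_words("REPLACED WATER PUMP"),
                    bag_of_words("SWAPPED COOLANT IMPELLER"))

print(f"reordered:  {reordered:.2f}")
print(f"paraphrase: {paraphrase:.2f}")
```

An embedding model such as BERT is meant to avoid both failure modes by placing texts with similar meaning near each other in vector space, whatever their exact wording.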
6.
Using BERT [1] embeddings for clustering
Image from “Attention Is All You Need.” [1]
“We fixed the thing” –[BERT]-> [0.124, 0.432, 0.4523, ….. , 1.2432]
7.
Clustering Algorithms
We tested several clustering algorithms. My favorite is the OPTICS [2] clustering algorithm as implemented in scikit-learn [3]. We also tested the DBSCAN [4] method and the KNN [5] method.
The OPTICS algorithm detects dense groupings in the data and designates those as cluster cores. It then allows each cluster to grow to a certain point and excludes outliers, which can be helpful for identifying unique entries.
Image from https://scikit-learn.org/stable/auto_examples/cluster/plot_optics.html#sphx-glr-auto-examples-cluster-plot-optics-py [3]
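The idea of dense cores, cluster growth, and excluded outliers can be sketched in plain Python. The block below is a simplified DBSCAN-style procedure, not OPTICS itself and not the scikit-learn implementation used in the talk. The function name `dbscan`, the parameters `eps` and `min_samples`, and the 2-D toy points standing in for high-dimensional embeddings are all assumptions made for illustration.

```python
import math

NOISE = -1


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)
    # Brute-force neighbourhoods; each point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1  # i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is also core: keep growing
    return labels


# Toy 2-D stand-ins for embeddings: two dense groups and one stray entry.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (9.0, 0.0),
]
labels = dbscan(points, eps=0.3, min_samples=3)
print(labels)
```

The stray point ends up labeled as noise rather than being forced into a cluster, which is the behavior the slide describes as helpful for spotting unique entries.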
8.
Cluster 1
• WATER PUMP SEIZED REQS WATER PUMP REPLACE
• REPLACE WATER PUMP
• REPLACE WATER PUMP
• REPLACE WATER PUMP REQS CO
• REPLACE WATER PUMP
Cluster 2
• REPLACED THE WATER PUMP ALL CODES CLEARED
• REPLACED THE WATER PUMP ALL CODES CLEARED
• REPLACED WATER PUMP CODES CLEARED
• REPLACED WATER PUMP ALL CODES CLEARED
• REPLACED WATER PUMP ALL CODES CLEARED
Cluster 3
• TEAM CLEANED PLUGS ALL CODES CLEARED
• TEAM CLEANED PLUGS ALL CODES CLEARED
• TEAM CLEANED PLUGS ALL CODES CLEARED
• TEAM CLEANED PLUGS ALL CODES CLEARED
Cluster 4
• AIR CONDITIONER LEAKING REFRIGERANT REPLACED
• AIR CONDITIONER WAS LEAKING REFRIGERANT
• AIR CONDITIONER CHILLER ALL CODES CLEARED
• REPLACED AIR CONDITIONER FOR LOW REFRIGERANT
9.
Similarity within Clusters
Graph diagram: six narratives, each linked to Cluster 1 by an [in] edge.
• Performed Corrosion Control
• Performed Corrosion Control
• Performed Corrosion Control
• Corrosion Control Performed
• Corosion Control
• Performed CC
Key: Narrative, Cluster
10.
Dissimilarity for Un-clustered Entries
Example un-clustered entries: "Perf MaintCOntrol", "Sandwich found in pump", "31542.1240", "CND"
• Really unique problems
• Misspelled entries
• Data entered incorrectly
11.
Cluster Linking
1. Average embeddings saved on each cluster
2. Euclidean distance calculated between each pair of cluster centers
Key: Narrative, Cluster
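The two linking steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `centroid` and `link_clusters`, the dictionary layout, and the 2-D toy vectors (real embeddings would have hundreds of dimensions) are all hypothetical.

```python
import math
from itertools import combinations


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def link_clusters(clusters):
    """Return {(id_a, id_b): distance} for every unordered pair of clusters."""
    centers = {cid: centroid(vecs) for cid, vecs in clusters.items()}
    return {
        (a, b): math.dist(centers[a], centers[b])
        for a, b in combinations(sorted(centers), 2)
    }


# Hypothetical clusters: cluster id -> embeddings of its member narratives.
clusters = {
    1: [[0.0, 0.0], [2.0, 0.0]],  # center (1, 0)
    2: [[1.0, 3.0], [1.0, 5.0]],  # center (1, 4)
    3: [[4.0, 4.0]],              # center (4, 4)
}
links = link_clusters(clusters)
for (a, b), distance in links.items():
    print(f"cluster {a} <-> cluster {b}: {distance:.1f}")
```

Each resulting distance could then be stored as a property on an edge between two cluster nodes, so that nearby clusters can be found by traversing the graph.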
12.
Current/Future Work: Graph Algorithms
• Centrality: what are the most important nodes?
• Pathfinding
• Similarity
• Community Detection
• What graph algorithms have you used and had success with? We are looking to try some soon.
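As a rough illustration of two of the algorithm families listed above, the sketch below computes degree centrality and connected components on a small hypothetical cluster graph. Connected components are only a very crude stand-in for community detection, and the node names ("c1" to "c6"), the edge list, and the helper functions are all assumptions. In practice these results would more likely come from a graph algorithm library than from hand-written code.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets built from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def connected_components(adjacency):
    """Group nodes that can reach each other, using depth-first search."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical links between cluster nodes.
edges = [("c1", "c2"), ("c1", "c3"), ("c1", "c4"), ("c5", "c6")]
adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
components = connected_components(adjacency)

print(max(centrality, key=centrality.get))
print(components)
```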
13.
Pipeline diagram (edge labels in brackets):
Text -[input]-> Language Model -[then]-> Clustering -[then]-> Graph Building -[then]-> Graph Algos
• Fine tuning -[adjustable]-> Language Model
• Clustering Algos -[adjustable]-> Clustering
• Graph Design -[adjustable]-> Graph Building
14.
Current/Future Work: Fine tuning BERT [1]
• Fine tuning a language model should improve efficacy on our dataset (and is probably helpful in almost every application).
• Given the training sets used for language models, they will probably struggle with slang and modern connotations. (Someone should study this.)
• We have been looking into fine tuning and think we have it working, but we have not tested it yet.
15.
Current/Future Work: Summary Stats
• Clearly the clusters have meaning
◦ But summary statistics seem uninteresting
• This is likely due to over-simplification
◦ I.e., reducing 500+ dimensional vectors to a single Euclidean distance loses too much information.
• Averaging seems to be… OK, but not amazing.
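A toy example shows why a single Euclidean distance can hide so much. The three vectors below are hypothetical 3-dimensional stand-ins for cluster centers (real embeddings would have hundreds of dimensions). Clusters `b` and `c` are equally far from `a`, yet they differ from it along entirely different dimensions, and the distance alone cannot tell them apart.

```python
import math

# Hypothetical cluster centers.
a = [0.0, 0.0, 0.0]
b = [1.0, 0.0, 0.0]  # differs from a only in dimension 0
c = [0.0, 0.0, 1.0]  # differs from a only in dimension 2

dist_ab = math.dist(a, b)
dist_ac = math.dist(a, c)

# The per-dimension differences keep the direction that the distance discards.
diff_ab = [y - x for x, y in zip(a, b)]
diff_ac = [y - x for x, y in zip(a, c)]

print(dist_ab, dist_ac)
print(diff_ab, diff_ac)
```

Keeping the difference vector, or at least a few of its largest components, is one possible way to retain more of the information that a single number drops.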
16.
Patterns: interlocking systems
Graph diagram: narrow clusters linked to broad clusters by seven [subset_of] edges.
Nodes: Fluid Change, Valve, Gasket, Piston, Engine, Transmission, Sched. Maint.
Key: Broad Clustering, Narrow Clustering
18.
Thank you!