Nodes 2023 - Knowledge graph based chatbot.pptx

© 2022 Neo4j, Inc. All rights reserved.
1
Knowledge-graph-based chatbot
By Tomaž Bratanič

About me
Tomaž Bratanič - Graph & LLM enthusiast Author of:
● https://bratanic-tomaz.medium.com/
● https://twitter.com/tb_tomaz
● https://github.com/tomasonjo
● bratanic.tomaz@gmail.com
Special discount code for 40% off: ctwnodes40
http://mng.bz/vPnM

3
LLMs are great, but:
• Knowledge
cutoff date
• Hallucinations
• Lack of
information
source
• Lack of
domain-specific
or private
information

4
● Generating training
data is hard and
expensive
● Doesn’t solve
knowledge cutoff
issue, only pushed
it to a later date
● No data sources
● Hard to debug, is
the information
coming from
foundation training
set or from fine-
tuning set or is it
hallucinated?
Supervised fine-tuning an LLM
From Google documentations

5
● Providing relevant
context in prompt
● Avoid relying on
facts provided by
LLMs
● Increased
transparency due to
provided sources
● The answer is as
good as the
provided context
Retrieval-augmented generation
(in-context passing)

Retrieval-augmented generation (in-context passing) flow

Chat with your PDF application

Vector similarity search
Given a question, find the most relevant documents based on a similarity metric (such as
Cosine Similarity) between vector of the question and vectors of contents.
Moving from keyword search to similarity (semantic) search.
Q: what is text
embedding?
abstractId similarity
456 0.923445
22 0.892114
… ...
Top K by similarity

Vector index in Neo4j - Added in 5.11
Define vector index Query vector index

Chat with your PDF - GenAI Stack
https://github.com/docker/genai-stack
PDF Input
User question
Generated answer

Chat with your PDF - GenAI Stack
https://github.com/docker/genai-stack
Neo4j LLM integrations with
LangChain and LlamaIndex can get
you started in less than 5 minutes

13
Knowledge graphs
Go beyond unstructured text in your LLM applications

14
● Consists of nodes
and relationships
● Nodes are used to
represent entities,
concepts, and more
● Relationships
describe the
context and how it
is connected to
other nodes
Knowledge graph

Retrieval-augmented generation flow for structured
information using generated Cypher statements
https://towardsdatascience.com/langchain-has-added-cypher-search-cb9d821120d5

16
● Pass the schema
information to LLM
along with the
question to
generate a
database query
● Can be enhanced
by providing
specific query
examples in the
prompt
Using LLMs to generate queries

Few-shot example for Cypher generation
Also useful for
translating domain-
specific lingo to
Cypher

Answer generating chain
Giving an LLM presentation
with a pop quiz at the end.

19
● Using vector or full
text index to map
values or entities
from the user
question to
database property
values
● Using vector
similarity search to
find most relevant
few-shot cypher
examples allows
you to scale to
more examples
Introducing dynamic prompting
https://github.com/tomasonjo/streamlit-neo4j-hackathon

Multi-hop question answering
Take for example the following question:
Did any of the former OpenAI employees start
their own company?
Could be broken into two sub questions:
● Who are the former employees of OpenAI?
● Did any of them start their own company?
Typical recommendation is to use vector similarity
search in combination with agents

21
Potential problems
● Repeated information
in the top N
documents
● Hard to define ideal
N number of
retrieved documents
● Potentially a lot of
steps in the question
generation process
Multi-hop question answering

22
● Move the bulk work
from query time to
ingestion time
● Don’t worry about
repeated
information
● Don’t worry about
determining the
ideal K number of
retrieved
documents
Multi-hop question answering - a
better way
https://blog.langchain.dev/constructing-knowledge-graphs-from-text-using-openai-functions/

Real-time analytics with
knowledge graph
Take for example the following question:
How many new customers we had last week?
Which services depend indirectly on the
authorization service?
Vector similarity search doesn’t natively
handle aggregations, transformation,
graph analytics, etc…
However, a graph database like Neo4j
offers a vast possibility of traditional and
graph analytics through the use of the
graph-based structured query language
Cypher and Graph Data Science plugin

24
● Which products will
be affected if
particular system
goes down?
● Who owns a
particular system?
● Are there any active
systems that are
not part of a
product (could be
deprecated)?
● Are there any
existing tasks
related to particular
bug in authorization
service?
Real-time analytics - Microservices
(dev-ops) graph
https://blog.langchain.dev/using-a-knowledge-graph-to-implement-a-devops-rag-application/

25
● Future of LLM
applications is the
combination of
structured and
unstructured data
● Explicit links
between structured
and unstructured
information give the
ability to perform
complex “metadata”
filtering
Combining structured and
unstructured information
Unstructured data

Questions
Tomaž Bratanič - Graph & LLM enthusiast Author of:
Book raffle:
https://forms.gle/HKWAaSnmzawYtVqr6
● https://bratanic-tomaz.medium.com/
● https://twitter.com/tb_tomaz
● https://github.com/tomasonjo
● bratanic.tomaz@gmail.com
Special discount code for 40% off: ctwnodes40

Nodes 2023 - Knowledge graph based chatbot.pptx

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Nodes 2023 - Knowledge graph based chatbot.pptx

Similar to Nodes 2023 - Knowledge graph based chatbot.pptx (20)

Recently uploaded

Recently uploaded (20)

Nodes 2023 - Knowledge graph based chatbot.pptx