RAG Pipelines with Real-Time data Cloudera

RAG pipelines with real time data

Named a Visionary in
the 2024 Gartner®
Magic Quadrant™ for
Data Science and
Machine Learning
Platforms.

Rapidly deploy trusted AI by
bringing any model to your
secured and governed data.

Rapidly deploy trusted
AI by bringing any
model to your
secured and
governed data.
Reduce cost, complexity,
and risk with AI at scale,
across any cloud or data
center.

Rapidly deploy trusted
AI by bringing any
model to your
secured and
governed data.
Reduce cost,
complexity, and risk
with AI at scale,
across any cloud or
data center.
Securely protect,
streamline, and deliver
data and AI quickly
with an open data
lakehouse.

EXABYTES
of data under management

RAG pipelines with real-time data

● Access information from external
sources like documents & DBs.
● Give more accurate and relevant
responses.
● Answer questions grounded in
speciﬁc knowledge.
● Provide utility beyond the limitations
of their initial training data.
Gives LLMs the ability to:
RAG pipelines

Partitioning Chunking Embedding
VectorDB
persistence
Data to VectorDB

Partitioning Chunking Embedding
VectorDB
persistence
Data to VectorDB
RAG query
Embedding
Similarity
Search
LLM
Prompt
LLM
Completion

● Usually a background process without
direct user interaction.
● Generally involves a series of steps to
process data.
● Often stateless, focusing on the ﬂow of
data rather than its state.
RAG pipelines
Characteristics:

RAG pipelines with real-time data
Characteristics:
● Often suggests instantaneous, there is a spectrum of acceptable
latency.
● Can be continuously streaming or sporadic
● Usually dictated by the trade off of price - performance - security

RAG
101
RAG
201
RAG
301
RAG
401

RAG
101
● Currently the best strategy to prevent
Confabulations and fabrications.

RAG
101
● Understand the context of a query -
not just a lexical search

RAG
101
● Enable multi-hop “machine-reasoning”

RAG
101
● Enable multi-hop “machine-reasoning”
● “Machine-reasoning” traceability

● RAG projects are both Statistical
and Machine Learning projects
RAG
201

Careful what you ask for…
Was there a [Cosine Similarity] distance result threshold
where I could programmatically disregard the query as
irrelevant?
“Will the Cleveland Browns win the
AFC North?”
Returned a result with a distance of 0.0141!
RAG
201

Careful what you ask for…
Was there a [Cosine Similarity] distance result threshold
where I could programmatically disregard the query as
irrelevant?
“Will the Cleveland Browns win the
AFC North?”
Returned a result with a distance of 0.0141!
“...and select "Align vertically" to
achieve these results”
RAG
201

● RAG projects are Machine Learning
projects
● #1 Machine Learning projects rely
on predictions not assertions. RAG
201

projects
● #2 Go ahead and anthropomorphize
your solution.
RAG
201
● #1 Machine Learning projects rely
on predictions not assertions.

RAG
201
projects
● Machine Learning projects rely on
prediction.
● Go ahead and anthropomorphize
your solution.
● #3 Compute optimization - Is it time for you
to adopt a hybrid inference design
strategy?

Hyrid Inference
In the 2010’s microservices become
the defacto standard architecture
for application development.
A lesser known architectural
pattern that emerged alongside
microservices was
Polyglot persistence

Polyglot Persistence
Blob Key-Value
Relational DB

Embedding
Similarity
Search
LLM
Prompt
LLM
Completion
User
Query
RAG query

User
Query
Embedding
Semantic
Similarity
LLM
Prompt/Completion
Sentence
Transformers
Milvus
OS LLM

User
Query
OS LLM
Sentence
Transformers
Milvus

RAG
301
● Data Type informs Partitioning and
Chunking strategy as well as Embedding
model
DENSE SPARSE
Very few zeros
or Null values
Each element
of the data can
change the
meaning of the
other elements
High
dimensionality
or a large
proportion of
Zeros or Nulls

RAG
301
model
DENSE SPARSE
Novels Categorical Data
Sensor Data
Graph Data
Technical Docs
Email

RAG
301
model
DENSE SPARSE
Apache NiFi is a
dataﬂow system based
on the concepts of
ﬂow-based
programming. It
supports powerful and
scalable directed
graphs of data routing,
transformation, and
system mediation logic.

RAG
301
model
Partitioning - The process
of dividing your data into
smaller more manageable
parts - when necessary

RAG
301
model
Chunking - Bundling
individual partitions
together based on the
content of the elements
● Chunk by Element
● Chunk by Section
● Chunk by Max. Seq. Length

Milvus VectorDB conﬁgurations
RAG
401

RAG
301
● Milvus VectorDB collection schema

Milvus Hybrid Search (with Keywords)

Milvus Hybrid Search (with Keywords)
Hybrid search is ideal for complex situations demanding high
accuracy.
Hybrid can be >1 vector similarity search and >1 lexical search
Hybrid can also be >1 vector similarity search on vector indices

Milvus Multi-trip query
● First query returns id of single row from Vector search (closest Euclidean
distance)
● Second query returns all rows with
matching section value.

Leverage DataLakes to avoid duplicates or silos
● Raw Text will not always be in the VectorDB

GateKeepers are NOT optional
RAG
401

GateKeepers are NOT optional
RAG
401
System Prompts should be immutable
Prompt injection attacks will evolve
rapidly

New RAG based ReadyFlows - Powered by Apache NiFi 2.0

https://zilliz.com/resources/analyst-report/zilliz-forrester-wave-vector-database-report
Zilliz Named a Leader in Vector Database Providers
Looking for the right Vector Database solution for your AI applications?
Choosing a vector database that allows you to efficiently manage and search large-scale
vector data for AI applications can be challenging.
Forrester assessed the most significant vector database providers in the market. Zilliz stood
out for its:
● Cloud-native scalability for handling massive vector datasets
● Lightning-fast search capabilities for real-time AI applications
● Robust open-source foundation with Milvus
● Exceptional technical support and reliability
Forrester notes that Zilliz "is at the forefront of innovation, delivering exceptional speed and
efficiency in vector processing and search to support real-time AI applications."
Find out why Forrester named Zilliz a Leader and how we can accelerate your AI initiatives.

RAG Pipelines with Real-Time data Cloudera

More Related Content

Similar to RAG Pipelines with Real-Time data Cloudera

More from Zilliz

Recently uploaded

RAG Pipelines with Real-Time data Cloudera