RAG pipelines with real time data
Named a Visionary in
the 2024 Gartner®
Magic Quadrant™ for
Data Science and
Machine Learning
Platforms.
Rapidly deploy trusted AI by
bringing any model to your
secured and governed data.
Rapidly deploy trusted
AI by bringing any
model to your
secured and
governed data.
Reduce cost, complexity,
and risk with AI at scale,
across any cloud or data
center.
Rapidly deploy trusted
AI by bringing any
model to your
secured and
governed data.
Reduce cost,
complexity, and risk
with AI at scale,
across any cloud or
data center.
Securely protect,
streamline, and deliver
data and AI quickly
with an open data
lakehouse.
EXABYTES
of data under management
RAG pipelines with real-time data
● Access information from external
sources like documents & DBs.
● Give more accurate and relevant
responses.
● Answer questions grounded in
specific knowledge.
● Provide utility beyond the limitations
of their initial training data.
Gives LLMs the ability to:
RAG pipelines
Partitioning Chunking Embedding
VectorDB
persistence
Data to VectorDB
Partitioning Chunking Embedding
VectorDB
persistence
Data to VectorDB
RAG query
Embedding
Similarity
Search
LLM
Prompt
LLM
Completion
● Usually a background process without
direct user interaction.
● Generally involves a series of steps to
process data.
● Often stateless, focusing on the flow of
data rather than its state.
RAG pipelines
Characteristics:
RAG pipelines with real-time data
Characteristics:
● Often suggests instantaneous, there is a spectrum of acceptable
latency.
● Can be continuously streaming or sporadic
● Usually dictated by the trade off of price - performance - security
RAG
101
RAG
201
RAG
301
RAG
401
RAG
101
● Currently the best strategy to prevent
Confabulations and fabrications.
RAG
101
● Currently the best strategy to prevent
Confabulations and fabrications.
RAG
101
● Currently the best strategy to prevent
Confabulations and fabrications.
● Understand the context of a query -
not just a lexical search
RAG
101
● Currently the best strategy to prevent
Confabulations and fabrications.
● Understand the context of a query -
not just a lexical search
● Enable multi-hop “machine-reasoning”
RAG
101
● Currently the best strategy to prevent
Confabulations and fabrications.
● Understand the context of a query -
not just a lexical search
● Enable multi-hop “machine-reasoning”
● “Machine-reasoning” traceability
● RAG projects are both Statistical
and Machine Learning projects
RAG
201
● RAG projects are both Statistical
and Machine Learning projects
Careful what you ask for…
Was there a [Cosine Similarity] distance result threshold
where I could programmatically disregard the query as
irrelevant?
“Will the Cleveland Browns win the
AFC North?”
Returned a result with a distance of 0.0141!
RAG
201
Careful what you ask for…
Was there a [Cosine Similarity] distance result threshold
where I could programmatically disregard the query as
irrelevant?
“Will the Cleveland Browns win the
AFC North?”
Returned a result with a distance of 0.0141!
“...and select "Align vertically" to
achieve these results”
RAG
201
● RAG projects are both Statistical
and Machine Learning projects
● RAG projects are Machine Learning
projects
● #1 Machine Learning projects rely
on predictions not assertions. RAG
201
● RAG projects are Machine Learning
projects
● #2 Go ahead and anthropomorphize
your solution.
RAG
201
● #1 Machine Learning projects rely
on predictions not assertions.
RAG
201
● RAG projects are Machine Learning
projects
● Machine Learning projects rely on
prediction.
● Go ahead and anthropomorphize
your solution.
● #3 Compute optimization - Is it time for you
to adopt a hybrid inference design
strategy?
Hyrid Inference
In the 2010’s microservices become
the defacto standard architecture
for application development.
A lesser known architectural
pattern that emerged alongside
microservices was
Polyglot persistence
Polyglot Persistence
Blob Key-Value
Relational DB
Embedding
Similarity
Search
LLM
Prompt
LLM
Completion
User
Query
RAG query
User
Query
Embedding
Semantic
Similarity
LLM
Prompt/Completion
Sentence
Transformers
Milvus
OS LLM
User
Query
OS LLM
Sentence
Transformers
Milvus
RAG
301
● Data Type informs Partitioning and
Chunking strategy as well as Embedding
model
DENSE SPARSE
Very few zeros
or Null values
Each element
of the data can
change the
meaning of the
other elements
High
dimensionality
or a large
proportion of
Zeros or Nulls
RAG
301
● Data Type informs Partitioning and
Chunking strategy as well as Embedding
model
DENSE SPARSE
Novels Categorical Data
Sensor Data
Graph Data
Technical Docs
Email
RAG
301
● Data Type informs Partitioning and
Chunking strategy as well as Embedding
model
DENSE SPARSE
Apache NiFi is a
dataflow system based
on the concepts of
flow-based
programming. It
supports powerful and
scalable directed
graphs of data routing,
transformation, and
system mediation logic.
RAG
301
● Data Type informs Partitioning and
Chunking strategy as well as Embedding
model
Partitioning - The process
of dividing your data into
smaller more manageable
parts - when necessary
RAG
301
● Data Type informs Partitioning and
Chunking strategy as well as Embedding
model
Chunking - Bundling
individual partitions
together based on the
content of the elements
● Chunk by Element
● Chunk by Section
● Chunk by Max. Seq. Length
RAG
301
RAG
301
RAG
301
Milvus VectorDB configurations
RAG
401
RAG
301
● Milvus VectorDB collection schema
Milvus Hybrid Search (with Keywords)
Milvus Hybrid Search (with Keywords)
Hybrid search is ideal for complex situations demanding high
accuracy.
Hybrid can be >1 vector similarity search and >1 lexical search
Hybrid can also be >1 vector similarity search on vector indices
Milvus Multi-trip query
Milvus Multi-trip query
● First query returns id of single row from Vector search (closest Euclidean
distance)
● Second query returns all rows with
matching section value.
Leverage DataLakes to avoid duplicates or silos
● Raw Text will not always be in the VectorDB
GateKeepers are NOT optional
RAG
401
GateKeepers are NOT optional
RAG
401
GateKeepers are NOT optional
RAG
401
System Prompts should be immutable
Prompt injection attacks will evolve
rapidly
GateKeepers are NOT optional
RAG
401
2.9
New RAG based ReadyFlows - Powered by Apache NiFi 2.0
https://zilliz.com/resources/analyst-report/zilliz-forrester-wave-vector-database-report
Zilliz Named a Leader in Vector Database Providers
Looking for the right Vector Database solution for your AI applications?
Choosing a vector database that allows you to efficiently manage and search large-scale
vector data for AI applications can be challenging.
Forrester assessed the most significant vector database providers in the market. Zilliz stood
out for its:
● Cloud-native scalability for handling massive vector datasets
● Lightning-fast search capabilities for real-time AI applications
● Robust open-source foundation with Milvus
● Exceptional technical support and reliability
Forrester notes that Zilliz "is at the forefront of innovation, delivering exceptional speed and
efficiency in vector processing and search to support real-time AI applications."
Find out why Forrester named Zilliz a Leader and how we can accelerate your AI initiatives.
Thank you! - Questions?

RAG Pipelines with Real-Time data Cloudera

  • 1.
    RAG pipelines withreal time data
  • 2.
    Named a Visionaryin the 2024 Gartner® Magic Quadrant™ for Data Science and Machine Learning Platforms.
  • 3.
    Rapidly deploy trustedAI by bringing any model to your secured and governed data.
  • 4.
    Rapidly deploy trusted AIby bringing any model to your secured and governed data. Reduce cost, complexity, and risk with AI at scale, across any cloud or data center.
  • 5.
    Rapidly deploy trusted AIby bringing any model to your secured and governed data. Reduce cost, complexity, and risk with AI at scale, across any cloud or data center. Securely protect, streamline, and deliver data and AI quickly with an open data lakehouse.
  • 7.
  • 8.
    RAG pipelines withreal-time data
  • 9.
    ● Access informationfrom external sources like documents & DBs. ● Give more accurate and relevant responses. ● Answer questions grounded in specific knowledge. ● Provide utility beyond the limitations of their initial training data. Gives LLMs the ability to: RAG pipelines
  • 10.
  • 11.
    Partitioning Chunking Embedding VectorDB persistence Datato VectorDB RAG query Embedding Similarity Search LLM Prompt LLM Completion
  • 12.
    ● Usually abackground process without direct user interaction. ● Generally involves a series of steps to process data. ● Often stateless, focusing on the flow of data rather than its state. RAG pipelines Characteristics:
  • 13.
    RAG pipelines withreal-time data Characteristics: ● Often suggests instantaneous, there is a spectrum of acceptable latency. ● Can be continuously streaming or sporadic ● Usually dictated by the trade off of price - performance - security
  • 14.
  • 15.
    RAG 101 ● Currently thebest strategy to prevent Confabulations and fabrications.
  • 16.
    RAG 101 ● Currently thebest strategy to prevent Confabulations and fabrications.
  • 17.
    RAG 101 ● Currently thebest strategy to prevent Confabulations and fabrications. ● Understand the context of a query - not just a lexical search
  • 18.
    RAG 101 ● Currently thebest strategy to prevent Confabulations and fabrications. ● Understand the context of a query - not just a lexical search ● Enable multi-hop “machine-reasoning”
  • 19.
    RAG 101 ● Currently thebest strategy to prevent Confabulations and fabrications. ● Understand the context of a query - not just a lexical search ● Enable multi-hop “machine-reasoning” ● “Machine-reasoning” traceability
  • 20.
    ● RAG projectsare both Statistical and Machine Learning projects RAG 201
  • 21.
    ● RAG projectsare both Statistical and Machine Learning projects Careful what you ask for… Was there a [Cosine Similarity] distance result threshold where I could programmatically disregard the query as irrelevant? “Will the Cleveland Browns win the AFC North?” Returned a result with a distance of 0.0141! RAG 201
  • 22.
    Careful what youask for… Was there a [Cosine Similarity] distance result threshold where I could programmatically disregard the query as irrelevant? “Will the Cleveland Browns win the AFC North?” Returned a result with a distance of 0.0141! “...and select "Align vertically" to achieve these results” RAG 201 ● RAG projects are both Statistical and Machine Learning projects
  • 23.
    ● RAG projectsare Machine Learning projects ● #1 Machine Learning projects rely on predictions not assertions. RAG 201
  • 24.
    ● RAG projectsare Machine Learning projects ● #2 Go ahead and anthropomorphize your solution. RAG 201 ● #1 Machine Learning projects rely on predictions not assertions.
  • 25.
    RAG 201 ● RAG projectsare Machine Learning projects ● Machine Learning projects rely on prediction. ● Go ahead and anthropomorphize your solution. ● #3 Compute optimization - Is it time for you to adopt a hybrid inference design strategy?
  • 26.
    Hyrid Inference In the2010’s microservices become the defacto standard architecture for application development. A lesser known architectural pattern that emerged alongside microservices was Polyglot persistence
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
    RAG 301 ● Data Typeinforms Partitioning and Chunking strategy as well as Embedding model DENSE SPARSE Very few zeros or Null values Each element of the data can change the meaning of the other elements High dimensionality or a large proportion of Zeros or Nulls
  • 32.
    RAG 301 ● Data Typeinforms Partitioning and Chunking strategy as well as Embedding model DENSE SPARSE Novels Categorical Data Sensor Data Graph Data Technical Docs Email
  • 33.
    RAG 301 ● Data Typeinforms Partitioning and Chunking strategy as well as Embedding model DENSE SPARSE Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
  • 34.
    RAG 301 ● Data Typeinforms Partitioning and Chunking strategy as well as Embedding model Partitioning - The process of dividing your data into smaller more manageable parts - when necessary
  • 35.
    RAG 301 ● Data Typeinforms Partitioning and Chunking strategy as well as Embedding model Chunking - Bundling individual partitions together based on the content of the elements ● Chunk by Element ● Chunk by Section ● Chunk by Max. Seq. Length
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
    RAG 301 ● Milvus VectorDBcollection schema
  • 41.
    Milvus Hybrid Search(with Keywords)
  • 42.
    Milvus Hybrid Search(with Keywords) Hybrid search is ideal for complex situations demanding high accuracy. Hybrid can be >1 vector similarity search and >1 lexical search Hybrid can also be >1 vector similarity search on vector indices
  • 43.
  • 44.
    Milvus Multi-trip query ●First query returns id of single row from Vector search (closest Euclidean distance) ● Second query returns all rows with matching section value.
  • 45.
    Leverage DataLakes toavoid duplicates or silos ● Raw Text will not always be in the VectorDB
  • 46.
    GateKeepers are NOToptional RAG 401
  • 47.
    GateKeepers are NOToptional RAG 401
  • 48.
    GateKeepers are NOToptional RAG 401 System Prompts should be immutable Prompt injection attacks will evolve rapidly
  • 49.
    GateKeepers are NOToptional RAG 401
  • 50.
  • 51.
    New RAG basedReadyFlows - Powered by Apache NiFi 2.0
  • 53.
    https://zilliz.com/resources/analyst-report/zilliz-forrester-wave-vector-database-report Zilliz Named aLeader in Vector Database Providers Looking for the right Vector Database solution for your AI applications? Choosing a vector database that allows you to efficiently manage and search large-scale vector data for AI applications can be challenging. Forrester assessed the most significant vector database providers in the market. Zilliz stood out for its: ● Cloud-native scalability for handling massive vector datasets ● Lightning-fast search capabilities for real-time AI applications ● Robust open-source foundation with Milvus ● Exceptional technical support and reliability Forrester notes that Zilliz "is at the forefront of innovation, delivering exceptional speed and efficiency in vector processing and search to support real-time AI applications." Find out why Forrester named Zilliz a Leader and how we can accelerate your AI initiatives.
  • 55.
    Thank you! -Questions?