1 | © Copyright 2024 Zilliz
Yujian Tang | Zilliz
Multilingual RAG
Speaker
Yujian Tang
Senior Developer Advocate, Zilliz
yujian@zilliz.com
https://www.linkedin.com/in/yujiantang
https://www.twitter.com/yujian_tang
CONTENTS
01 RAG Review
02 LLMs and Embedding Models
03 Vector Databases
04 Demo
01 RAG Review
RAG
Inject your data via a vector database like Milvus/Zilliz
Primary Use Cases
- Factual Recall
- Forced Data Injection
- Cost Optimization
[Diagram: RAG flow with Query, Embedding Model, Your Data, Milvus, and LLM]
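The flow in this diagram can be sketched in a few lines of Python. This is a toy under loud assumptions: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, the in-memory `store` list stands in for Milvus, and `build_prompt` only assembles the text that would be handed to an LLM. None of these names come from the talk or from the Milvus API.

```python
import math
from collections import Counter

def embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, store, top_k=1):
    """Embed the query and return the top_k most similar stored texts."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(query, context):
    """Inject the retrieved data into the prompt that would be sent to the LLM."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

# "Your Data": embedded once and kept in the store (Milvus in the real flow).
docs = [
    "Milvus is an open source vector database",
    "Apple pie is a dessert",
]
store = [(d, embed(d)) for d in docs]

query = "What is Milvus"
context = retrieve(query, store)
prompt = build_prompt(query, context)
print(prompt)
```

In a real deployment the `store` lookup would be a vector search against Milvus and the prompt would be passed to an LLM; both steps are left out here on purpose.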
02 LLMs and Embedding Models
How did LLMs come about?
A Basic Neural Net
A Recurrent Neural Network
A Transformer Architecture
GPT
What about Embedding Models?
Vector Databases
Deep Learning Models without the Last Layer
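One way to picture "a deep learning model without the last layer" is a tiny network whose hidden activations are returned instead of its final prediction. The sketch below is illustrative only: the weights `W_HIDDEN` and `W_OUTPUT` are made-up numbers, and the 3 -> 2 -> 2 shape is an assumption chosen to keep the example small.

```python
import math

# Hypothetical fixed weights for a tiny 3 -> 2 -> 2 network (illustrative only).
W_HIDDEN = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
W_OUTPUT = [[1.0, -1.0], [-1.0, 1.0]]

def matvec(w, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def hidden(x):
    """Hidden layer: linear map followed by tanh."""
    return [math.tanh(v) for v in matvec(W_HIDDEN, x)]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x):
    """Full model: hidden layer followed by the last (classification) layer."""
    return softmax(matvec(W_OUTPUT, hidden(x)))

def embed(x):
    """Same model with the last layer dropped: the hidden activations are the vector."""
    return hidden(x)

x = [1.0, 0.0, 2.0]
embedding = embed(x)   # a 2-dimensional vector to store in a vector database
probs = predict(x)     # class probabilities from the full model
print(embedding, probs)
```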
LLMs
- Large models
- Generate text
- Reasoning capability
- Based on transformers
Embedding Models
- Smaller
- Non-predictive
- Non-generative
03 Vector Databases
Find Semantically Similar Data
Apple made profits of $97 Billion in 2023
I like to eat apple pie for profit in 2023
Apple’s bottom line increased by record numbers in 2023
But wait! There’s more!
Semantic Similarity
Image from Sutor et al
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Man = [0.5, 0.2]
Queen - Woman + Man = King
  Queen = [0.3, 0.9]
- Woman = [0.3, 0.4]
        = [0.0, 0.5]
+ Man   = [0.5, 0.2]
= King  = [0.5, 0.7]
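The arithmetic on this slide can be checked directly. The snippet below is a small sketch using the four toy vectors from the slide; the helper names `add` and `sub` are introduced here for illustration.

```python
import math

woman = [0.3, 0.4]
queen = [0.3, 0.9]
king = [0.5, 0.7]
man = [0.5, 0.2]

def sub(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    """Element-wise a + b."""
    return [x + y for x, y in zip(a, b)]

# Queen - Woman + Man
result = add(sub(queen, woman), man)

# Compare with a tolerance because these are floating-point numbers.
matches_king = all(math.isclose(r, k, abs_tol=1e-9) for r, k in zip(result, king))
print(result, matches_king)
```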
Similarity metrics are ways to measure distance in vector space
Vector Similarity Metric: L2 (Euclidean)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
d(Queen, King) = √((0.3-0.5)² + (0.9-0.7)²)
= √((-0.2)² + (0.2)²)
= √(0.04 + 0.04)
= √0.08 ≅ 0.28
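The same L2 calculation as a short Python sketch; `l2_distance` is a name chosen here for illustration.

```python
import math

def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

queen = [0.3, 0.9]
king = [0.5, 0.7]
d = l2_distance(queen, king)
print(round(d, 2))  # 0.28
```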
Vector Similarity Metric: Inner Product (IP)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Queen · King = (0.3*0.5) + (0.9*0.7)
= 0.15 + 0.63 = 0.78
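The inner product from this slide, written as a minimal sketch; `inner_product` is an illustrative helper name.

```python
import math

def inner_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

queen = [0.3, 0.9]
king = [0.5, 0.7]
ip = inner_product(queen, king)
print(round(ip, 2))  # 0.78
```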
Vector Similarity Metric: Cosine
Queen = [0.3, 0.9]
King = [0.5, 0.7]
cos(Queen, King) = ((0.3*0.5) + (0.9*0.7)) / (√(0.3² + 0.9²) * √(0.5² + 0.7²))
= (0.15 + 0.63) / (√0.90 * √0.74)
= 0.78 / √0.666
≅ 0.96
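Evaluating the cosine formula for these two vectors gives 0.78 / √0.666, which is about 0.96. The sketch below computes it step by step; `cosine_similarity` is an illustrative helper name.

```python
import math

def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

queen = [0.3, 0.9]
king = [0.5, 0.7]
cos_qk = cosine_similarity(queen, king)
print(round(cos_qk, 2))  # 0.96
```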
Vector Similarity Metrics
Euclidean - Spatial distance
Cosine - Orientational distance
Inner Product - Both
With normalized vectors, IP = Cosine
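The last point can be verified numerically: once both vectors are scaled to length 1, their inner product equals the cosine similarity of the original vectors. A minimal sketch, with helper names chosen here for illustration:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

queen = [0.3, 0.9]
king = [0.5, 0.7]

ip_normalized = dot(normalize(queen), normalize(king))
cos_raw = cosine(queen, king)
print(math.isclose(ip_normalized, cos_raw))  # True
```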
Indexes organize the way we access our data
Inverted File Index
Source:
https://towardsdatascience.com/similarity-search-with-ivfpq-9c6348fd4db3
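A minimal sketch of the inverted file idea: each vector is filed under its nearest centroid, and a query only scans the lists of the `nprobe` closest centroids. The centroids here are hypothetical fixed values (a real index would typically learn them with k-means), and `build_ivf` and `search_ivf` are illustrative names, not a library API.

```python
import math

def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vid, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda i: math.dist(v, centroids[i]))
        lists[nearest].append(vid)
    return lists

def search_ivf(query, vectors, centroids, lists, nprobe=1, top_k=1):
    """Scan only the nprobe closest lists instead of the whole collection."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [vid for c in order[:nprobe] for vid in lists[c]]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:top_k]

# Toy vectors from the earlier slides: Woman, Queen, King, Man.
vectors = [[0.3, 0.4], [0.3, 0.9], [0.5, 0.7], [0.5, 0.2]]
centroids = [[0.4, 0.3], [0.4, 0.8]]  # hypothetical; normally learned from the data

lists = build_ivf(vectors, centroids)
hits = search_ivf([0.45, 0.75], vectors, centroids, lists)
print(lists, hits)
```

Raising `nprobe` scans more lists, which trades speed for recall.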
Hierarchical Navigable Small Worlds (HNSW)
Source:
https://arxiv.org/ftp/arxiv/papers/1603/1603.09320.pdf
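Full HNSW builds several graph layers and keeps a candidate list during search, which is too much for a slide-sized example. The sketch below shows only the core routing step, a greedy walk on a single proximity graph: move to whichever neighbor is closer to the query until no neighbor improves. The graph and the name `greedy_search` are hand-made assumptions for illustration.

```python
import math

def greedy_search(query, vectors, neighbors, entry):
    """Walk the graph from `entry`, always stepping to the closest neighbor."""
    current = entry
    while True:
        best = min(neighbors[current], key=lambda n: math.dist(query, vectors[n]))
        if math.dist(query, vectors[best]) >= math.dist(query, vectors[current]):
            return current  # no neighbor is closer: local minimum reached
        current = best

# Hypothetical points and a hand-built neighbor graph (a simple chain).
vectors = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [3.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

found = greedy_search([2.9, 1.1], vectors, neighbors, entry=0)
print(found)  # 4
```

A greedy walk can stop at a local minimum that is not the true nearest neighbor; that is one reason graph indexes are approximate.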
Scalar Quantization (SQ)
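A minimal sketch of scalar quantization, assuming the common scheme of mapping each dimension's observed range onto 256 integer levels (one byte per dimension instead of a float). The function names are illustrative, not from a specific library.

```python
def sq_train(vectors):
    """Record the min and max of every dimension."""
    dims = range(len(vectors[0]))
    lo = [min(v[d] for v in vectors) for d in dims]
    hi = [max(v[d] for v in vectors) for d in dims]
    return lo, hi

def sq_encode(v, lo, hi, levels=256):
    """Map each float to an integer bucket in [0, levels - 1]."""
    codes = []
    for x, l, h in zip(v, lo, hi):
        scale = (h - l) or 1.0  # avoid dividing by zero on constant dimensions
        code = round((x - l) / scale * (levels - 1))
        codes.append(max(0, min(levels - 1, code)))  # clamp out-of-range values
    return codes

def sq_decode(codes, lo, hi, levels=256):
    """Approximate the original floats from the integer codes."""
    return [l + c / (levels - 1) * (h - l) for c, l, h in zip(codes, lo, hi)]

# Toy vectors from the earlier slides: Woman, Queen, King, Man.
vectors = [[0.3, 0.4], [0.3, 0.9], [0.5, 0.7], [0.5, 0.2]]
lo, hi = sq_train(vectors)

king = [0.5, 0.7]
codes = sq_encode(king, lo, hi)
approx = sq_decode(codes, lo, hi)
print(codes, approx)
```

The decoded vector is close to the original but not identical; that small error is the accuracy side of the memory tradeoff.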
Product Quantization
Source:
https://towardsdatascience.com/product-quantization-for-similarity-search-2f1f67c5fddd
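A minimal sketch of product quantization: the vector is cut into sub-vectors, and each sub-vector is replaced by the index of its nearest centroid in that segment's own codebook. The codebooks below are hypothetical fixed values (a real index would typically learn them with k-means per segment), and `pq_encode` and `pq_decode` are illustrative names.

```python
import math

def pq_encode(v, codebooks):
    """Replace each sub-vector with the id of its nearest codebook centroid."""
    sub_len = len(v) // len(codebooks)
    codes = []
    for i, book in enumerate(codebooks):
        sub = v[i * sub_len:(i + 1) * sub_len]
        codes.append(min(range(len(book)), key=lambda j: math.dist(sub, book[j])))
    return codes

def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    out = []
    for code, book in zip(codes, codebooks):
        out.extend(book[code])
    return out

# Hypothetical codebooks: two segments of two dimensions, two centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # centroids for dimensions 0-1
    [[0.0, 1.0], [1.0, 0.0]],  # centroids for dimensions 2-3
]

v = [0.9, 1.1, 0.1, 0.8]
codes = pq_encode(v, codebooks)     # two small integers instead of four floats
approx = pq_decode(codes, codebooks)
print(codes, approx)
```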
Indexes Overview
- IVF = Intuitive, medium memory, performant
- HNSW = Graph based, high memory, highly performant
- Flat = brute force
- SQ = bucketize across one dimension, accuracy x memory tradeoff
- PQ = bucketize across two dimensions, more accuracy x memory tradeoff
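For reference, the Flat (brute force) baseline in the list above is simply "compare the query with every stored vector". A minimal sketch using the toy vectors from the earlier slides; `flat_search` is an illustrative name.

```python
import heapq
import math

def flat_search(query, items, top_k=2):
    """Exact search: measure the distance to every stored vector, keep the closest."""
    return heapq.nsmallest(top_k, items, key=lambda name: math.dist(query, items[name]))

items = {
    "Woman": [0.3, 0.4],
    "Queen": [0.3, 0.9],
    "King": [0.5, 0.7],
    "Man": [0.5, 0.2],
}

nearest = flat_search([0.5, 0.7], items)
print(nearest)  # ['King', 'Queen']
```

Flat search is exact but its cost grows linearly with the number of vectors, which is the motivation for the other index types.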
04 Demo
[Diagram: demo RAG flow with Query, Embedding Model(s), Language Data, and LLM]
RAG
Start building with Zilliz Cloud today!
zilliz.com/cloud

Introduction to Multilingual Retrieval Augmented Generation (RAG)
