Introduction to Multilingual Retrieval Augmented Generation (RAG)
© Copyright 2024 Zilliz
Yujian Tang | Zilliz
Multilingual RAG
Speaker
Yujian Tang
Senior Developer Advocate, Zilliz
yujian@zilliz.com
https://www.linkedin.com/in/yujiantang
https://www.twitter.com/yujian_tang
CONTENTS
01 RAG Review
02 LLMs and Embedding Models
03 Vector Databases
04 Demo
01 RAG Review
RAG
Inject your data via a vector database like Milvus/Zilliz
Primary Use Case
- Factual Recall
- Forced Data Injection
- Cost Optimization
[Diagram: Query, LLM, Milvus, Your Data, Embedding Model]
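The flow in the diagram can be sketched in a few lines. This is a minimal illustration only: `embed` is a hypothetical bag-of-words function standing in for a real embedding model, a plain Python list stands in for Milvus, and the final prompt is built but not sent to any LLM.

```python
import math
from collections import Counter

# Hypothetical stand-in for an embedding model: a bag-of-words count vector.
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# "Your Data": documents are embedded once and stored.
# The list below is an assumption standing in for a vector database.
documents = [
    "Milvus is an open source vector database",
    "Apple pie is a dessert made with apples",
]
store = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    """Embed the query and return the k most similar stored documents."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query):
    """Inject the retrieved context into the prompt that would go to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("What is a vector database")
print(prompt)
```

The design point is that the LLM only sees the retrieved context, so swapping the stored data changes the answers without retraining the model.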
02 LLMs and Embedding Models
How did LLMs come about?
A Basic Neural Net
A Recurrent Neural Network
A Transformer Architecture
GPT
What about Embedding Models?
[Diagram: Deep Learning Models w/o Last Layer, Vector Databases]
LLMs
- Large models
- Generate text
- Reasoning capability
- Based on transformers
Embedding Models
- Smaller
- Non predictive
- Non generative
03 Vector Databases
Find Semantically Similar Data
Apple made profits of $97 Billion in 2023
I like to eat apple pie for profit in 2023
Apple’s bottom line increased by record numbers in 2023
But wait! There’s more!
Semantic Similarity
Image from Sutor et al
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Man = [0.5, 0.2]
Queen - Woman + Man = King
  Queen = [0.3, 0.9]
- Woman = [0.3, 0.4]
        = [0.0, 0.5]
+ Man   = [0.5, 0.2]
  King  = [0.5, 0.7]
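The arithmetic on this slide can be checked directly. The sketch below uses the slide's illustrative 2-dimensional vectors; the helper names `sub` and `add` are introduced here for illustration, and the comparison uses a tolerance because floating-point sums are not always exact.

```python
import math

woman = [0.3, 0.4]
queen = [0.3, 0.9]
king = [0.5, 0.7]
man = [0.5, 0.2]

def sub(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    """Element-wise a + b."""
    return [x + y for x, y in zip(a, b)]

# Queen - Woman + Man
result = add(sub(queen, woman), man)

# Compare against King with a small tolerance.
matches_king = all(math.isclose(r, k, abs_tol=1e-9) for r, k in zip(result, king))
print(result, matches_king)
```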
Similarity metrics are ways to measure distance in vector space
Vector Similarity Metric: L2 (Euclidean)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
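$$
\begin{aligned}
d(\text{Queen}, \text{King}) &= \sqrt{(0.3 - 0.5)^2 + (0.9 - 0.7)^2} \\
&= \sqrt{(-0.2)^2 + (0.2)^2} \\
&= \sqrt{0.04 + 0.04} \\
&= \sqrt{0.08} \approx 0.28
\end{aligned}
$$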
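As a quick check of the Euclidean calculation, here is a minimal sketch in plain Python. The function name `l2_distance` is chosen here for illustration.

```python
import math

queen = [0.3, 0.9]
king = [0.5, 0.7]

def l2_distance(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

d = l2_distance(queen, king)
print(round(d, 2))
```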
Vector Similarity Metric: Inner Product (IP)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Queen · King = (0.3*0.5) + (0.9*0.7)
= 0.15 + 0.63 = 0.78
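The same inner product in code, as a minimal sketch (the function name `inner_product` is an illustrative choice):

```python
queen = [0.3, 0.9]
king = [0.5, 0.7]

def inner_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

ip = inner_product(queen, king)
print(round(ip, 2))
```

Unlike L2, a larger inner product means the vectors are more similar.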
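Vector Similarity Metric: Cosine
Queen = [0.3, 0.9]
King = [0.5, 0.7]
θ = angle between Queen and King
$$
\begin{aligned}
\cos\theta &= \frac{(0.3 \cdot 0.5) + (0.9 \cdot 0.7)}{\sqrt{0.3^2 + 0.9^2} \cdot \sqrt{0.5^2 + 0.7^2}} \\
&= \frac{0.15 + 0.63}{\sqrt{0.90} \cdot \sqrt{0.74}} \\
&= \frac{0.78}{\sqrt{0.666}} \\
&\approx 0.96
\end{aligned}
$$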
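Evaluating the cosine formula numerically for these two vectors gives roughly 0.96, meaning Queen and King point in nearly the same direction. A minimal sketch, with the function name `cosine_similarity` chosen for illustration:

```python
import math

queen = [0.3, 0.9]
king = [0.5, 0.7]

def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cos = cosine_similarity(queen, king)
print(round(cos, 2))
```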
Vector Similarity Metrics
Euclidean - Spatial distance
Cosine - Orientational distance
Inner Product - Both
With normalized vectors, IP = Cosine
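The last point can be verified numerically: once both vectors are scaled to unit length, their inner product equals their cosine similarity. The sketch below reuses the Queen and King example vectors; all function names are illustrative.

```python
import math

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )

queen = [0.3, 0.9]
king = [0.5, 0.7]

ip_normalized = inner_product(normalize(queen), normalize(king))
cos = cosine_similarity(queen, king)
print(math.isclose(ip_normalized, cos))
```

In practice this is why normalizing embeddings up front lets a system use the cheaper inner product while still ranking by angle.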
Indexes organize the way we access our data
Inverted File Index
Source:
https://towardsdatascience.com/similarity-search-with-ivfpq-9c6348fd4db3
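The idea behind an inverted file index can be sketched with a toy example: vectors are grouped under their nearest centroid, and a query only scans the lists of the closest `nprobe` centroids. The centroids and data below are hand-picked assumptions for illustration; a real IVF index typically learns centroids with k-means.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Assumed, hand-picked centroids (a real index would learn these from the data).
centroids = [[0.0, 0.0], [1.0, 1.0]]
vectors = {"a": [0.1, 0.2], "b": [0.2, 0.1], "c": [0.9, 0.8], "d": [1.1, 0.9]}

def nearest_centroids(v, n):
    """Indices of the n centroids closest to v."""
    order = sorted(range(len(centroids)), key=lambda i: l2(v, centroids[i]))
    return order[:n]

# Build step: put each vector in the inverted list of its closest centroid.
inverted_lists = {i: [] for i in range(len(centroids))}
for key, vec in vectors.items():
    inverted_lists[nearest_centroids(vec, 1)[0]].append(key)

def search(query, k=1, nprobe=1):
    """Scan only the lists of the nprobe nearest centroids."""
    candidates = [
        key for i in nearest_centroids(query, nprobe) for key in inverted_lists[i]
    ]
    return sorted(candidates, key=lambda key: l2(query, vectors[key]))[:k]

print(search([0.95, 0.85]))
```

Raising `nprobe` scans more lists, trading speed for recall.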
Hierarchical Navigable Small Worlds (HNSW)
Source:
https://arxiv.org/ftp/arxiv/papers/1603/1603.09320.pdf
Scalar Quantization (SQ)
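One common form of scalar quantization maps each floating-point component to an 8-bit integer bucket using the per-dimension minimum and maximum. The sketch below assumes that scheme and uses the example vectors from the earlier slides; the function names are illustrative and this is not any particular library's implementation.

```python
def train(vectors):
    """Per-dimension minimum and maximum over the data."""
    dims = list(zip(*vectors))
    return [min(d) for d in dims], [max(d) for d in dims]

def quantize(v, lows, highs):
    """Map each component to an integer bucket in 0..255."""
    codes = []
    for x, lo, hi in zip(v, lows, highs):
        scale = (x - lo) / (hi - lo) if hi > lo else 0.0
        codes.append(round(min(max(scale, 0.0), 1.0) * 255))
    return codes

def dequantize(codes, lows, highs):
    """Approximate reconstruction of the original floats."""
    return [lo + c / 255 * (hi - lo) for c, lo, hi in zip(codes, lows, highs)]

data = [[0.3, 0.4], [0.3, 0.9], [0.5, 0.7], [0.5, 0.2]]
lows, highs = train(data)

queen_codes = quantize([0.3, 0.9], lows, highs)
king_codes = quantize([0.5, 0.7], lows, highs)
king_approx = dequantize(king_codes, lows, highs)
print(queen_codes, king_codes, king_approx)
```

Each component now needs one byte instead of four, at the cost of a small reconstruction error.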
Product Quantization
Source:
https://towardsdatascience.com/product-quantization-for-similarity-search-2f1f67c5fddd
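Product quantization splits each vector into sub-vectors and replaces every sub-vector with the ID of its nearest codebook entry. The toy sketch below uses hand-picked codebooks as an assumption; real systems learn one codebook per sub-space, usually with k-means.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Assumed, hand-picked codebooks: one per sub-space, two entries each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for the first half of the vector
    [[0.0, 1.0], [1.0, 0.0]],  # codebook for the second half of the vector
]

def encode(v):
    """Replace each sub-vector with the index of its nearest codebook entry."""
    sub_len = len(v) // len(codebooks)
    codes = []
    for j, book in enumerate(codebooks):
        chunk = v[j * sub_len:(j + 1) * sub_len]
        codes.append(min(range(len(book)), key=lambda c: sq_dist(chunk, book[c])))
    return codes

def decode(codes):
    """Rebuild an approximate vector by concatenating the chosen entries."""
    out = []
    for book, c in zip(codebooks, codes):
        out.extend(book[c])
    return out

codes = encode([0.9, 1.1, 0.1, 0.8])
approx = decode(codes)
print(codes, approx)
```

Here a 4-dimensional float vector is stored as two small integers, which is where the memory saving comes from.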
Indexes Overview
- IVF = Intuitive, medium memory, performant
- HNSW = Graph based, high memory, highly performant
- Flat = brute force
- SQ = bucketize across one dimension, accuracy x memory tradeoff
- PQ = bucketize across two dimensions, more accuracy x memory tradeoff
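For reference, the Flat (brute force) baseline that the other indexes approximate can be sketched as an exhaustive scan. The example reuses the vectors from the semantic similarity slide; the function name `flat_search` is an illustrative choice.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

vectors = {
    "Woman": [0.3, 0.4],
    "King": [0.5, 0.7],
    "Man": [0.5, 0.2],
}

def flat_search(query, k=2):
    """Compare the query against every stored vector and keep the k closest."""
    ranked = sorted(vectors, key=lambda name: l2(query, vectors[name]))
    return ranked[:k]

# Nearest neighbours of Queen = [0.3, 0.9]
neighbours = flat_search([0.3, 0.9])
print(neighbours)
```

A flat scan is exact but touches every vector, which is the cost the IVF, HNSW, SQ, and PQ approaches try to reduce.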
04 Demo
[Diagram: Query, LLM, Language Data, Embedding Model(s)]
RAG
Start building
with Zilliz Cloud today!
zilliz.com/cloud