1 | © Copyright 2024 Zilliz
Yujian Tang | Zilliz
Multilingual RAG
Speaker
Yujian Tang
Senior Developer Advocate, Zilliz
yujian@zilliz.com
https://www.linkedin.com/in/yujiantang
https://www.twitter.com/yujian_tang
CONTENTS
01 RAG Review
02 LLMs and Embedding Models
03 Vector Databases
04 Demo
01 RAG Review
RAG
Inject your data via a vector database like Milvus/Zilliz
Primary Use Cases
- Factual Recall
- Forced Data Injection
- Cost Optimization
[Diagram: Query, LLM, Milvus, Your Data, Embedding Model]
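The flow between these components can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model, a plain list stands in for Milvus, and the final LLM call is left out, so the sketch stops at the prompt that would be sent. All function names and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, k=1):
    """Return the k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, context):
    """Inject the retrieved context into the prompt for the LLM."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# "Your Data": embedded once and kept in the (toy) vector store.
documents = [
    "Milvus is an open source vector database.",
    "Paris is the capital of France.",
]
store = [(doc, embed(doc)) for doc in documents]

query = "What is Milvus?"
context = retrieve(query, store, k=1)
prompt = build_prompt(query, context)  # this string would go to the LLM
print(prompt)
```

In a real deployment the list would be replaced by a vector database collection and `embed` by a trained embedding model; only the retrieve-then-prompt shape carries over.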
02 LLMs and Embedding Models
How did LLMs come about?
A Basic Neural Net
A Recurrent Neural Network
A Transformer Architecture
GPT
What about Embedding Models?
Vector Databases
Deep Learning Models w/o Last Layer
LLMs
- Large models
- Generate text
- Reasoning capability
- Based on transformers
Embedding Models
- Smaller
- Non predictive
- Non generative
03 Vector Databases
Find Semantically Similar Data
Apple made profits of $97 Billion in 2023
I like to eat apple pie for profit in 2023
Apple’s bottom line increased by record numbers in 2023
But wait! There’s more!
Semantic Similarity
Image from Sutor et al
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Man = [0.5, 0.2]
Queen - Woman + Man = King
  Queen   = [0.3, 0.9]
- Woman   = [0.3, 0.4]
          = [0.0, 0.5]
+ Man     = [0.5, 0.2]
= King    = [0.5, 0.7]
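The vector arithmetic on this slide can be checked with a short Python sketch. The vectors are the slide's two-dimensional toy values; the helper names `add` and `sub` are introduced here purely for illustration.

```python
import math

woman = [0.3, 0.4]
queen = [0.3, 0.9]
king = [0.5, 0.7]
man = [0.5, 0.2]


def sub(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def add(a, b):
    """Element-wise a + b."""
    return [x + y for x, y in zip(a, b)]


# Queen - Woman + Man should land on King.
result = add(sub(queen, woman), man)

# Compare with a tolerance, since floating-point sums are not always exact.
matches_king = all(math.isclose(r, k, abs_tol=1e-9) for r, k in zip(result, king))
print(result, matches_king)
```

Real embeddings have hundreds of dimensions and such analogies hold only approximately, so a nearest-neighbor lookup would replace the exact comparison.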
Similarity metrics are ways to measure distance in vector space
Vector Similarity Metric: L2 (Euclidean)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
d(Queen, King) = √((0.3 - 0.5)² + (0.9 - 0.7)²)
               = √((-0.2)² + (0.2)²)
               = √(0.04 + 0.04)
               = √0.08 ≅ 0.28
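As a quick check of the L2 computation, here is a minimal Python sketch using the slide's Queen and King vectors. The function name `l2_distance` is an illustrative choice, not part of any library.

```python
import math

queen = [0.3, 0.9]
king = [0.5, 0.7]


def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


distance = l2_distance(queen, king)
print(round(distance, 2))  # 0.28
```

A smaller L2 distance means the vectors are closer, so results are ranked in ascending order.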
Vector Similarity Metric: Inner Product (IP)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Queen · King = (0.3*0.5) + (0.9*0.7)
= 0.15 + 0.63 = 0.78
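The same inner product can be reproduced in Python. This is a sketch on the slide's toy vectors; `inner_product` is a hypothetical helper name.

```python
queen = [0.3, 0.9]
king = [0.5, 0.7]


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


ip = inner_product(queen, king)
print(round(ip, 2))  # 0.78
```

Unlike L2, a larger inner product means a closer match, so results are ranked in descending order.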
Vector Similarity Metric: Cosine
Queen = [0.3, 0.9]
King = [0.5, 0.7]
θ = angle between Queen and King
cos(Queen, King) = ((0.3*0.5) + (0.9*0.7)) / (√(0.3² + 0.9²) * √(0.5² + 0.7²))
                 = (0.15 + 0.63) / (√0.9 * √0.74)
                 = 0.78 / √0.666
                 ≅ 0.96
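Evaluating the cosine formula numerically for the slide's Queen and King vectors gives roughly 0.96, since 0.78 / √0.666 ≈ 0.78 / 0.816. A minimal Python sketch, with the hypothetical helper name `cosine_similarity`:

```python
import math

queen = [0.3, 0.9]
king = [0.5, 0.7]


def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cos = cosine_similarity(queen, king)
print(round(cos, 2))  # 0.96
```

A value near 1 means the two vectors point in nearly the same direction, regardless of their lengths.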
Vector Similarity Metrics
Euclidean - Spatial distance
Cosine - Orientational distance
Inner Product - Both
With normalized vectors, IP = Cosine
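The last point, that inner product and cosine agree on normalized vectors, can be verified with a small Python sketch. It reuses the Queen and King toy vectors from the earlier slides; the helper names are illustrative.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Scale a vector to unit length."""
    length = norm(v)
    return [x / length for x in v]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


queen = [0.3, 0.9]
king = [0.5, 0.7]

# Cosine on the raw vectors.
cosine = inner_product(queen, king) / (norm(queen) * norm(king))

# Inner product on the unit-length versions of the same vectors.
ip_normalized = inner_product(normalize(queen), normalize(king))

print(math.isclose(cosine, ip_normalized))  # True
```

This is why normalizing embeddings once at insert time lets the cheaper inner product produce the same ranking as cosine.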
Indexes organize the way we access our data
Inverted File Index
Source: https://towardsdatascience.com/similarity-search-with-ivfpq-9c6348fd4db3
Hierarchical Navigable Small Worlds (HNSW)
Source: https://arxiv.org/ftp/arxiv/papers/1603/1603.09320.pdf
Scalar Quantization (SQ)
Product Quantization
Source: https://towardsdatascience.com/product-quantization-for-similarity-search-2f1f67c5fddd
Indexes Overview
- IVF = Intuitive, medium memory, performant
- HNSW = Graph based, high memory, highly performant
- Flat = brute force
- SQ = bucketize across one dimension, accuracy vs. memory tradeoff
- PQ = bucketize across two dimensions, larger accuracy vs. memory tradeoff
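To make the accuracy versus memory tradeoff concrete, here is a hedged Python sketch of scalar quantization: each float is mapped to one of 256 integer buckets, so a value can be stored in one byte instead of four. Using a single global min/max range is an assumption made for brevity (real implementations may fit a range per dimension), and all function names are hypothetical.

```python
def fit_range(vectors):
    """Find the global min and max over all values (assumes max > min)."""
    values = [x for v in vectors for x in v]
    return min(values), max(values)


def quantize(vector, lo, hi, levels=256):
    """Map each float to the index of its nearest bucket in [0, levels - 1]."""
    step = (hi - lo) / (levels - 1)
    return [round((x - lo) / step) for x in vector]


def dequantize(codes, lo, hi, levels=256):
    """Map bucket indices back to approximate float values."""
    step = (hi - lo) / (levels - 1)
    return [lo + code * step for code in codes]


# The toy vectors from the semantic similarity slide.
vectors = [[0.3, 0.4], [0.3, 0.9], [0.5, 0.7], [0.5, 0.2]]
lo, hi = fit_range(vectors)

codes = [quantize(v, lo, hi) for v in vectors]
restored = [dequantize(c, lo, hi) for c in codes]

# The reconstruction error is bounded by half a bucket width.
step = (hi - lo) / 255
max_error = max(
    abs(x - y) for v, r in zip(vectors, restored) for x, y in zip(v, r)
)
print(codes, max_error)
```

The restored vectors differ slightly from the originals, which is the accuracy given up in exchange for the smaller storage footprint.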
04 Demo
[Diagram: Query, LLM, Language Data, Embedding Model(s)]
RAG
Start building with Zilliz Cloud today!
zilliz.com/cloud
