SlideShare a Scribd company logo
1 of 38
Download to read offline
1 | © Copyright 2023 Zilliz
1
Yujian Tang | Zilliz
Chunking, Embeddings, and
Vector Databases
2 | © Copyright 2023 Zilliz
2
Yujian Tang
Senior Developer Advocate, Zilliz
yujian@zilliz.com
https://www.linkedin.com/in/yujiantang
https://www.twitter.com/yujian_tang
Speaker
3 | © Copyright 2023 Zilliz
3
GPT-Cache
Key maintainer of the following Open-Source projects
Founded
Headquarters
Focus
2017
Redwood Shores, CA
Vector database company for enterprise-grade AI built on Milvus, the
popular open-source vector database that helps organizations quickly
create AI applications.
Zilliz at a Glance
4 | © Copyright 2023 Zilliz
4
01 Why RAG?
CONTENTS
03
04 Vector Databases
02 Chunking Considerations
Embedding Models and Strategies
5 | © Copyright 2023 Zilliz
5
01 RAG Overview
6 | © Copyright 2023 Zilliz
6
Basic Idea
You want to use your data with a large
language model
7 | © Copyright 2023 Zilliz
7
RAG vs Fine Tuning
RAG
Inject your data via a vector
database like Milvus/Zilliz
Query LLM
Milvus
Your Data
Primary Use Case
- Factual Recall
- Forced Data Injection
- Cost Optimization
8 | © Copyright 2023 Zilliz
8
RAG vs Fine Tuning
LLM
Fine Tuning
Augment an LLM by training it
on your data
Your Data
“New” LLM
Query
Primary Use Case
- Style transfer
- Domain specific usage
9 | © Copyright 2023 Zilliz
9
Takeaway
Use RAG to force the LLM to work with your data by injecting it
via a vector database like Milvus or Zilliz
10 | © Copyright 2023 Zilliz
10
02 Chunking
11 | © Copyright 2023 Zilliz
11
Chunking Considerations
Chunk Size
Chunk Overlap
Character Splitters
12 | © Copyright 2023 Zilliz
12
How Does Your Data Look?
Conversation
Data
Documentation
Data
Lecture or Q/A
Data
13 | © Copyright 2023 Zilliz
13
Examples
14 | © Copyright 2023 Zilliz
14
Examples
15 | © Copyright 2023 Zilliz
15
Examples
16 | © Copyright 2023 Zilliz
16
Examples
17 | © Copyright 2023 Zilliz
17
Examples
18 | © Copyright 2023 Zilliz
18
Examples
19 | © Copyright 2023 Zilliz
19
Your chunking strategy depends on what your data looks
like and what you need from it.
Takeaway:
20 | © Copyright 2023 Zilliz
20
03 Embeddings
21 | © Copyright 2023 Zilliz
21
Examining Embeddings
Picking a model
What to embed
Metadata
22 | © Copyright 2023 Zilliz
22
Embeddings Models
23 | © Copyright 2023 Zilliz
23
Embeddings Strategies
Level 1: Embedding Chunks Directly
Level 2: Embedding Sub and Super Chunks
Level 3: Incorporating Chunking and Non-Chunking Metadata
24 | © Copyright 2023 Zilliz
24
Metadata Examples
Chunking
- Paragraph position
- Section header
- Larger paragraph
- Sentence Number
- …
Non-Chunking
- Author
- Publisher
- Organization
- Role Based Access Control
- …
25 | © Copyright 2023 Zilliz
25
Your embeddings strategy depends on your accuracy,
cost, and use case needs
Takeaway:
26 | © Copyright 2023 Zilliz
26
04 Vector Databases
27 | © Copyright 2023 Zilliz
27
Basic Idea
Vector Databases provide the ability to inject your data via
semantic similarity
Considerations include: scale, performance, and flexibility
28 | © Copyright 2023 Zilliz
28
Semantic Similarity
Image from Sutor et al
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Man = [0.5, 0.2]
Queen - Woman + Man = King
Queen = [0.3, 0.9]
- Woman = [0.3, 0.4]
[0.0, 0.5]
+ Man = [0.5, 0.2]
King = [0.5, 0.7]
Man = [0.5, 0.2]
29 | © Copyright 2023 Zilliz
29
Vector
Databases
Where do Vectors Come From?
30 | © Copyright 2023 Zilliz
30
Basic Functionality
31 | © Copyright 2023 Zilliz
31
Why Not Use a SQL/NoSQL Database?
• Inefficiency in High-dimensional spaces
• Suboptimal Indexing
• Inadequate query support
• Lack of scalability
• Limited analytics capabilities
• Data conversion issues
TL;DR: Vector operations are too computationally intensive for
traditional database infrastructures
32 | © Copyright 2023 Zilliz
32
Why Not Use a Vector Search Library?
• Have to manually implement filtering
• Not optimized to take advantage of the latest hardware
• Unable to handle large scale data
• Lack of lifecycle management
• Inefficient indexing capabilities
• No built in safety mechanisms
TL;DR: Vector search libraries lack the infrastructure to help you scale,
deploy, and manage your apps in production.
33 | © Copyright 2023 Zilliz
33
What is Milvus/Zilliz ideal for?
• Advanced filtering
• Hybrid search
• Durability and backups
• Replications/High Availability
• Sharding
• Aggregations
• Lifecycle management
• Multi-tenancy
• High query load
• High insertion/deletion
• Full precision/recall
• Accelerator support (GPU,
FPGA)
• Billion-scale storage
Purpose-built to store, index and query vector embeddings from unstructured data at scale.
34 | © Copyright 2023 Zilliz
34
Meta Storage
Root Query Data Index
Coordinator Service
Proxy
Proxy
etcd
Log Broker
SDK
Load Balancer
DDL/DCL
DML
NOTIFICATION
CONTROL SIGNAL
Object Storage
Minio / S3 / AzureBlob
Log Snapshot Delta File Index File
Worker Node QUERY DATA DATA
Message Storage
Access Layer
Query Node Data Node Index Node
High-level overview of Milvus’ Architecture
35 | © Copyright 2023 Zilliz
35
Milvus Architecture: Differentiation
1. Cloud Native, Distributed System Architecture
2. True Separation of Concerns
3. Scalable Index Creation Strategy with 512 MB Segments
36 | © Copyright 2023 Zilliz
36
Vector Databases are purpose-built to handle
indexing, storing, and querying vector data.
Milvus & Zilliz are specifically designed for high
performance and billion+ scale use cases.
Takeaway:
37 | © Copyright 2023 Zilliz
37
Vector Database Resources
Give Milvus a Star! Chat with me on Discord!
38 | © Copyright 2023 Zilliz
38
Start building
with Zilliz Cloud today!
zilliz.com/cloud

More Related Content

Similar to Chunking, Embeddings, and Vector Databases

Digital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdfDigital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdf
ssuserd23711
 
Data virtualization
Data virtualizationData virtualization
Data virtualization
Hamed Hatami
 

Similar to Chunking, Embeddings, and Vector Databases (20)

Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
 
State of DevOps - Build the Thing Right
State of DevOps - Build the Thing RightState of DevOps - Build the Thing Right
State of DevOps - Build the Thing Right
 
Digital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdfDigital_IOT_(Microsoft_Solution).pdf
Digital_IOT_(Microsoft_Solution).pdf
 
Redefining the Cloud with AI – State & Use Cases​ | SoftClouds
Redefining the Cloud with AI – State & Use Cases​ | SoftCloudsRedefining the Cloud with AI – State & Use Cases​ | SoftClouds
Redefining the Cloud with AI – State & Use Cases​ | SoftClouds
 
Data virtualization
Data virtualizationData virtualization
Data virtualization
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Pitfalls & Challenges Faced During a Microservices Architecture Implementation
Pitfalls & Challenges Faced During a Microservices Architecture ImplementationPitfalls & Challenges Faced During a Microservices Architecture Implementation
Pitfalls & Challenges Faced During a Microservices Architecture Implementation
 
Keynote: Art of the Possible - Moore
Keynote: Art of the Possible - MooreKeynote: Art of the Possible - Moore
Keynote: Art of the Possible - Moore
 
Hybrid Cloud - Key Benefits & Must Have Requirements
Hybrid Cloud - Key Benefits & Must Have RequirementsHybrid Cloud - Key Benefits & Must Have Requirements
Hybrid Cloud - Key Benefits & Must Have Requirements
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
 
Cloud Architecture in the Data Center
Cloud Architecture in the Data CenterCloud Architecture in the Data Center
Cloud Architecture in the Data Center
 
Performance Testing
Performance TestingPerformance Testing
Performance Testing
 
Cloud Migration.pdf
Cloud Migration.pdfCloud Migration.pdf
Cloud Migration.pdf
 
Unleashing the Power of SAP BusinessObjects Metadata with 360Suite
Unleashing the Power of SAP BusinessObjects Metadata with 360SuiteUnleashing the Power of SAP BusinessObjects Metadata with 360Suite
Unleashing the Power of SAP BusinessObjects Metadata with 360Suite
 
The Hacking Games - Security vs Productivity and Operational Efficiency 20230119
The Hacking Games - Security vs Productivity and Operational Efficiency 20230119The Hacking Games - Security vs Productivity and Operational Efficiency 20230119
The Hacking Games - Security vs Productivity and Operational Efficiency 20230119
 
An overview of BigQuery
An overview of BigQuery An overview of BigQuery
An overview of BigQuery
 
Community Resource Portal for the Healthcare Sector
Community Resource Portal for the Healthcare SectorCommunity Resource Portal for the Healthcare Sector
Community Resource Portal for the Healthcare Sector
 
Assessing the Business Value of SDN Datacenter Security Solutions
Assessing the Business Value of SDN Datacenter Security SolutionsAssessing the Business Value of SDN Datacenter Security Solutions
Assessing the Business Value of SDN Datacenter Security Solutions
 
IRJET- Redsc: Reliablity of Data Sharing in Cloud
IRJET- Redsc: Reliablity of Data Sharing in CloudIRJET- Redsc: Reliablity of Data Sharing in Cloud
IRJET- Redsc: Reliablity of Data Sharing in Cloud
 
Using power bi in hybrid it
Using power bi in hybrid itUsing power bi in hybrid it
Using power bi in hybrid it
 

More from Zilliz

More from Zilliz (18)

Emergent Methods: Multilingual narrative tracking in the news - real-time exp...
Emergent Methods: Multilingual narrative tracking in the news - real-time exp...Emergent Methods: Multilingual narrative tracking in the news - real-time exp...
Emergent Methods: Multilingual narrative tracking in the news - real-time exp...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Zilliz - Overview of Generative models in ML
Zilliz - Overview of Generative models in MLZilliz - Overview of Generative models in ML
Zilliz - Overview of Generative models in ML
 
Integrating Multimodal AI in Your Apps with Floom
Integrating Multimodal AI in Your Apps with FloomIntegrating Multimodal AI in Your Apps with Floom
Integrating Multimodal AI in Your Apps with Floom
 
Build streaming LLM with Timeplus and Zilliz
Build streaming LLM with Timeplus and ZillizBuild streaming LLM with Timeplus and Zilliz
Build streaming LLM with Timeplus and Zilliz
 
Beyond Retrieval Augmented Generation (RAG): Vector Databases
Beyond Retrieval Augmented Generation (RAG): Vector DatabasesBeyond Retrieval Augmented Generation (RAG): Vector Databases
Beyond Retrieval Augmented Generation (RAG): Vector Databases
 
Introduction to Large Language Model Customization.pdf
Introduction to Large Language Model Customization.pdfIntroduction to Large Language Model Customization.pdf
Introduction to Large Language Model Customization.pdf
 
Voyage AI: cutting-edge embeddings and rerankers for search and RAG
Voyage AI: cutting-edge embeddings and rerankers for search and RAGVoyage AI: cutting-edge embeddings and rerankers for search and RAG
Voyage AI: cutting-edge embeddings and rerankers for search and RAG
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
 
VectorDB Schema Design 101 - Considerations for Building a Scalable and Perfo...
VectorDB Schema Design 101 - Considerations for Building a Scalable and Perfo...VectorDB Schema Design 101 - Considerations for Building a Scalable and Perfo...
VectorDB Schema Design 101 - Considerations for Building a Scalable and Perfo...
 
Voyage AI Embedding Models for Retrieval Augmented Generation
Voyage AI Embedding Models for Retrieval Augmented GenerationVoyage AI Embedding Models for Retrieval Augmented Generation
Voyage AI Embedding Models for Retrieval Augmented Generation
 
Chat with your data, privately and locally
Chat with your data, privately and locallyChat with your data, privately and locally
Chat with your data, privately and locally
 
Introducing Milvus and new features in 2.4 release
Introducing Milvus and new features in 2.4 releaseIntroducing Milvus and new features in 2.4 release
Introducing Milvus and new features in 2.4 release
 

Recently uploaded

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
FIDO Alliance
 

Recently uploaded (20)

ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Generative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfGenerative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdf
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
UiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overviewUiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overview
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 

Chunking, Embeddings, and Vector Databases

  • 1. 1 | © Copyright 2023 Zilliz 1 Yujian Tang | Zilliz Chunking, Embeddings, and Vector Databases
  • 2. 2 | © Copyright 2023 Zilliz 2 Yujian Tang Senior Developer Advocate, Zilliz yujian@zilliz.com https://www.linkedin.com/in/yujiantang https://www.twitter.com/yujian_tang Speaker
  • 3. 3 | © Copyright 2023 Zilliz 3 GPT-Cache Key maintainer of the following Open-Source projects Founded Headquarters Focus 2017 Redwood Shores, CA Vector database company for enterprise-grade AI built on Milvus, the popular open-source vector database that helps organizations quickly create AI applications. Zilliz at a Glance
  • 4. 4 | © Copyright 2023 Zilliz 4 01 Why RAG? CONTENTS 03 04 Vector Databases 02 Chunking Considerations Embedding Models and Strategies
  • 5. 5 | © Copyright 2023 Zilliz 5 01 RAG Overview
  • 6. 6 | © Copyright 2023 Zilliz 6 Basic Idea You want to use your data with a large language model
  • 7. 7 | © Copyright 2023 Zilliz 7 RAG vs Fine Tuning RAG Inject your data via a vector database like Milvus/Zilliz Query LLM Milvus Your Data Primary Use Case - Factual Recall - Forced Data Injection - Cost Optimization
  • 8. 8 | © Copyright 2023 Zilliz 8 RAG vs Fine Tuning LLM Fine Tuning Augment an LLM by training it on your data Your Data “New” LLM Query Primary Use Case - Style transfer - Domain specific usage
  • 9. 9 | © Copyright 2023 Zilliz 9 Takeaway Use RAG to force the LLM to work with your data by injecting it via a vector database like Milvus or Zilliz
  • 10. 10 | © Copyright 2023 Zilliz 10 02 Chunking
  • 11. 11 | © Copyright 2023 Zilliz 11 Chunking Considerations Chunk Size Chunk Overlap Character Splitters
  • 12. 12 | © Copyright 2023 Zilliz 12 How Does Your Data Look? Conversation Data Documentation Data Lecture or Q/A Data
  • 13. 13 | © Copyright 2023 Zilliz 13 Examples
  • 14. 14 | © Copyright 2023 Zilliz 14 Examples
  • 15. 15 | © Copyright 2023 Zilliz 15 Examples
  • 16. 16 | © Copyright 2023 Zilliz 16 Examples
  • 17. 17 | © Copyright 2023 Zilliz 17 Examples
  • 18. 18 | © Copyright 2023 Zilliz 18 Examples
  • 19. 19 | © Copyright 2023 Zilliz 19 Your chunking strategy depends on what your data looks like and what you need from it. Takeaway:
  • 20. 20 | © Copyright 2023 Zilliz 20 03 Embeddings
  • 21. 21 | © Copyright 2023 Zilliz 21 Examining Embeddings Picking a model What to embed Metadata
  • 22. 22 | © Copyright 2023 Zilliz 22 Embeddings Models
  • 23. 23 | © Copyright 2023 Zilliz 23 Embeddings Strategies Level 1: Embedding Chunks Directly Level 2: Embedding Sub and Super Chunks Level 3: Incorporating Chunking and Non-Chunking Metadata
  • 24. 24 | © Copyright 2023 Zilliz 24 Metadata Examples Chunking - Paragraph position - Section header - Larger paragraph - Sentence Number - … Non-Chunking - Author - Publisher - Organization - Role Based Access Control - …
  • 25. 25 | © Copyright 2023 Zilliz 25 Your embeddings strategy depends on your accuracy, cost, and use case needs Takeaway:
  • 26. 26 | © Copyright 2023 Zilliz 26 04 Vector Databases
  • 27. 27 | © Copyright 2023 Zilliz 27 Basic Idea Vector Databases provide the ability to inject your data via semantic similarity Considerations include: scale, performance, and flexibility
  • 28. 28 | © Copyright 2023 Zilliz 28 Semantic Similarity Image from Sutor et al Woman = [0.3, 0.4] Queen = [0.3, 0.9] King = [0.5, 0.7] Woman = [0.3, 0.4] Queen = [0.3, 0.9] King = [0.5, 0.7] Man = [0.5, 0.2] Queen - Woman + Man = King Queen = [0.3, 0.9] - Woman = [0.3, 0.4] [0.0, 0.5] + Man = [0.5, 0.2] King = [0.5, 0.7] Man = [0.5, 0.2]
  • 29. 29 | © Copyright 2023 Zilliz 29 Vector Databases Where do Vectors Come From?
  • 30. 30 | © Copyright 2023 Zilliz 30 Basic Functionality
  • 31. 31 | © Copyright 2023 Zilliz 31 Why Not Use a SQL/NoSQL Database? • Inefficiency in High-dimensional spaces • Suboptimal Indexing • Inadequate query support • Lack of scalability • Limited analytics capabilities • Data conversion issues TL;DR: Vector operations are too computationally intensive for traditional database infrastructures
  • 32. 32 | © Copyright 2023 Zilliz 32 Why Not Use a Vector Search Library? • Have to manually implement filtering • Not optimized to take advantage of the latest hardware • Unable to handle large scale data • Lack of lifecycle management • Inefficient indexing capabilities • No built in safety mechanisms TL;DR: Vector search libraries lack the infrastructure to help you scale, deploy, and manage your apps in production.
  • 33. 33 | © Copyright 2023 Zilliz 33 What is Milvus/Zilliz ideal for? • Advanced filtering • Hybrid search • Durability and backups • Replications/High Availability • Sharding • Aggregations • Lifecycle management • Multi-tenancy • High query load • High insertion/deletion • Full precision/recall • Accelerator support (GPU, FPGA) • Billion-scale storage Purpose-built to store, index and query vector embeddings from unstructured data at scale.
  • 34. 34 | © Copyright 2023 Zilliz 34 Meta Storage Root Query Data Index Coordinator Service Proxy Proxy etcd Log Broker SDK Load Balancer DDL/DCL DML NOTIFICATION CONTROL SIGNAL Object Storage Minio / S3 / AzureBlob Log Snapshot Delta File Index File Worker Node QUERY DATA DATA Message Storage Access Layer Query Node Data Node Index Node High-level overview of Milvus’ Architecture
  • 35. 35 | © Copyright 2023 Zilliz 35 Milvus Architecture: Differentiation 1. Cloud Native, Distributed System Architecture 2. True Separation of Concerns 3. Scalable Index Creation Strategy with 512 MB Segments
  • 36. 36 | © Copyright 2023 Zilliz 36 Vector Databases are purpose-built to handle indexing, storing, and querying vector data. Milvus & Zilliz are specifically designed for high performance and billion+ scale use cases. Takeaway:
  • 37. 37 | © Copyright 2023 Zilliz 37 Vector Database Resources Give Milvus a Star! Chat with me on Discord!
  • 38. 38 | © Copyright 2023 Zilliz 38 Start building with Zilliz Cloud today! zilliz.com/cloud