SlideShare a Scribd company logo
1 of 19
RAG Patterns and Vector Search
in Generative AI
Udaiappa Ramachandran ( Udai )
https://udai.io
About me
• Udaiappa Ramachandran ( Udai )
• CTO/CSO-Akumina, Inc.
• Microsoft Azure MVP
• Cloud Expert
• Microsoft Azure, Amazon Web Services, and Google
• New Hampshire Cloud User Group (http://www.meetup.com/nashuaug )
• https://udai.io
Agenda
• Keyword Search
• Vector Search
• Hybrid Search
• Open AI vector embedding
• Azure Cognitive Search
• Demo…Demo…Demo…
Keyword Search
• Pros:
• Simple and easy to use
• Fast and efficient
• Scalable to very large data sets
• Well supported by existing search engines and other tools
• cost-effective
• easy to implement
• Cons:
• Can be inaccurate for ambiguous or complex queries
• Sensitive to typos and misspellings
• Does not understand the semantic relationships between words
• Language barriers
Keyword Search
Vector Embedded Search
• Pros:
• Better at understanding the semantics of language
• Can handle complex and ambiguous queries
• More robust to typos and misspellings
• Can be used for cross-lingual search
• Cons:
• Computationally expensive
• More difficult to implement and maintain
• Requires a large dataset of pre-trained embeddings
• Not as well-supported by existing search engines and other tools
Vector Search
Hybrid Search
• Combines both keyword search and vector search
• Retrieve using keyword search then refine using vector search to rerank
• Benefits:
• Improved accuracy
• Increased relevance
• Wider range of queries
Vector Indexes in real-world applications
• Product recommendation: Amazon uses hybrid search to recommend products to its customers.
The search engine considers both the keywords that the customer has searched for and the
customer's past purchase history.
• Anomaly detection: Credit card companies use hybrid search to detect fraudulent transactions. The
search engine considers both the transaction amount and the location of the transaction.
• Document search: Google Scholar uses hybrid search to rank academic papers. The search engine
considers both the keywords in the paper's title and abstract, as well as the citations that the paper
has received.
• Google Search: Google search uses vector indexes to store and retrieve document embeddings.
This allows Google to efficiently search and rank billions of web pages
• Facebook Recommendations: Facebook uses vector indexes to store and retrieve user embeddings
and item embeddings. This allows Facebook to recommend relevant content to its users
• Netflix Recommendations: Netflix uses vector indexes to store and retrieve user embeddings and
movie embeddings. This allows Netflix to recommend relevant movies to its users.
Cosine Similarity
Cosine similarity is a measure of similarity between two vectors. Mathematically, it is calculated by taking the dot product of the two vectors and dividing by the product of
their magnitude’s
cosine_similarity(x,y)=dot(x,y)/(||x|| * ||y||)
where
• x and y are two vectors
• dot(x,y) is the dot product of the two vectors
• ||x|| and ||y|| are the magnitudes of the vectors
To illustrate cosine similarity with a hypothetical example, let's say we have two vectors x and y:
x = [3, 2] (This represents vector x with two components, 3 and 2)
y = [1, 4] (This represents vector y with two components, 1 and 4)
Calculate the dot product of x and y: x • y = (3 * 1) + (2 * 4) = 3 + 8 = 11
Calculate the magnitudes of vectors x and y:
||x|| = √(3^2 + 2^2) = √(9 + 4) = √13
||y|| = √(1^2 + 4^2) = √(1 + 16) = √17
Calculate the cosine similarity:
cos(θ) = (x • y) / (||x|| * ||y||) = 11 / (√13 * √17) ≈ 0.745
Vector Databases
• Pinecone
• Managed service, high performance, hybrid storage (in memory and disk)
• Qdrant
• Open-source, highly scalable, filtering
• Weaviate
• Open-source, semantic search, modular design (let you pick the best machine learning model)
• Millvus
• Open-source, cloud-native, Trillian-scale search
• Faiss
• Library, not a database (by Facebook), advanced algorithms, integration (Faiss excels when
integrated with traditional databases for added vector search capability)
Why Azure Congnitive Search?
• Key Word Search
• Vector Search
• Hybrid Search
• Advanced filtering
• Semantic ( L2 reranking)
• Built-in chunking
• Bring your own vector
Retrieval Augmented Generation (RAG)
https://polite-ground-030dc3103.4.azurestaticapps.net/event/c555-ee52
Retrieval Augmented Generation (RAG)
Retrieval Augmented Generation (RAG)
Building RAG applications
• Azure AI Studio with Prompt flow
• Co-Pilot studio
• Semantic Kernel
Demo
• Vector embedding
• Vector Search
• RAG Application
Reference
• https://github.com/Azure/cognitive-search-vector-pr
• https://learn.microsoft.com/en-us/azure/search/
• https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-
overview
• https://learn.microsoft.com/en-us/azure/search/hybrid-search-overview
• https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking
• https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-
to/image-retrieval
• https://huggingface.co/spaces/mteb/leaderboard
Thanks for your time and trust!

More Related Content

Similar to RAG Patterns and Vector Search in Generative AI

Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure applicationCodecamp Romania
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousingVaishnavi
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Lucidworks
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrTrey Grainger
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 antimo musone
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
 
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkSimon Hughes
 
Machine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMachine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMateusz Dymczyk
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphTrey Grainger
 
3 Understanding Search
3 Understanding Search3 Understanding Search
3 Understanding Searchmasiclat
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 CareerBuilder.com
 
DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!dclsocialmedia
 
How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...Petter Skodvin-Hvammen
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation SystemsRobin Reni
 

Similar to RAG Patterns and Vector Search in Generative AI (20)

Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure application
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousing
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...
 
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank Talk
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Machine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMachine Learning for (JVM) Developers
Machine Learning for (JVM) Developers
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge Graph
 
Architecting for Data Science
Architecting for Data ScienceArchitecting for Data Science
Architecting for Data Science
 
3 Understanding Search
3 Understanding Search3 Understanding Search
3 Understanding Search
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018
 
DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!
 
How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
1. oose
1. oose1. oose
1. oose
 

More from Udaiappa Ramachandran (20)

Level up your security using Intune.pptx
Level up your security using Intune.pptxLevel up your security using Intune.pptx
Level up your security using Intune.pptx
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
AI-Plugins-Planners-Persona-SemanticKernel.pptx
AI-Plugins-Planners-Persona-SemanticKernel.pptxAI-Plugins-Planners-Persona-SemanticKernel.pptx
AI-Plugins-Planners-Persona-SemanticKernel.pptx
 
DOTNET8.pptx
DOTNET8.pptxDOTNET8.pptx
DOTNET8.pptx
 
AzureSynapse.pptx
AzureSynapse.pptxAzureSynapse.pptx
AzureSynapse.pptx
 
SecureAzureServicesUsingADAuthentication.pptx
SecureAzureServicesUsingADAuthentication.pptxSecureAzureServicesUsingADAuthentication.pptx
SecureAzureServicesUsingADAuthentication.pptx
 
AzureOpenAI.pptx
AzureOpenAI.pptxAzureOpenAI.pptx
AzureOpenAI.pptx
 
OpenAI-Copilot-ChatGPT.pptx
OpenAI-Copilot-ChatGPT.pptxOpenAI-Copilot-ChatGPT.pptx
OpenAI-Copilot-ChatGPT.pptx
 
DiagnoseAndSolveproblems.pptx
DiagnoseAndSolveproblems.pptxDiagnoseAndSolveproblems.pptx
DiagnoseAndSolveproblems.pptx
 
MAUI.pptx
MAUI.pptxMAUI.pptx
MAUI.pptx
 
CosmosDB.pptx
CosmosDB.pptxCosmosDB.pptx
CosmosDB.pptx
 
.NET7.pptx
.NET7.pptx.NET7.pptx
.NET7.pptx
 
AzureDevOps
AzureDevOpsAzureDevOps
AzureDevOps
 
AzureCostManagementAndBilling
AzureCostManagementAndBillingAzureCostManagementAndBilling
AzureCostManagementAndBilling
 
.NET6.pptx
.NET6.pptx.NET6.pptx
.NET6.pptx
 
Azure Automation and Update Management
Azure Automation and Update ManagementAzure Automation and Update Management
Azure Automation and Update Management
 
Azure staticwebapps
Azure staticwebappsAzure staticwebapps
Azure staticwebapps
 
Azure privatelink
Azure privatelinkAzure privatelink
Azure privatelink
 
Azure Security Center
Azure Security CenterAzure Security Center
Azure Security Center
 
Azure signalr service
Azure signalr serviceAzure signalr service
Azure signalr service
 

Recently uploaded

JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 

Recently uploaded (20)

JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 

RAG Patterns and Vector Search in Generative AI

  • 1. RAG Patterns and Vector Search in Generative AI Udaiappa Ramachandran ( Udai ) https://udai.io
  • 2. About me • Udaiappa Ramachandran ( Udai ) • CTO/CSO-Akumina, Inc. • Microsoft Azure MVP • Cloud Expert • Microsoft Azure, Amazon Web Services, and Google • New Hampshire Cloud User Group (http://www.meetup.com/nashuaug ) • https://udai.io
  • 3. Agenda • Keyword Search • Vector Search • Hybrid Search • Open AI vector embedding • Azure Cognitive Search • Demo…Demo…Demo…
  • 4. Keyword Search • Pros: • Simple and easy to use • Fast and efficient • Scalable to very large data sets • Well supported by existing search engines and other tools • cost-effective • easy to implement • Cons: • Can be inaccurate for ambiguous or complex queries • Sensitive to typos and misspellings • Does not understand the semantic relationships between words • Language barriers
  • 6. Vector Embedded Search • Pros: • Better at understanding the semantics of language • Can handle complex and ambiguous queries • More robust to typos and misspellings • Can be used for cross-lingual search • Cons: • Computationally expensive • More difficult to implement and maintain • Requires a large dataset of pre-trained embeddings • Not as well-supported by existing search engines and other tools
  • 8. Hybrid Search • Combines both keyword search and vector search • Retrieve using keyword search then refine using vector search to rerank • Benefits: • Improved accuracy • Increased relevance • Wider range of queries
  • 9. Vector Indexes in real-world applications • Product recommendation: Amazon uses hybrid search to recommend products to its customers. The search engine considers both the keywords that the customer has searched for and the customer's past purchase history. • Anomaly detection: Credit card companies use hybrid search to detect fraudulent transactions. The search engine considers both the transaction amount and the location of the transaction. • Document search: Google Scholar uses hybrid search to rank academic papers. The search engine considers both the keywords in the paper's title and abstract, as well as the citations that the paper has received. • Google Search: Google search uses vector indexes to store and retrieve document embeddings. This allows Google to efficiently search and rank billions of web pages • Facebook Recommendations: Facebook uses vector indexes to store and retrieve user embeddings and item embeddings. This allows Facebook to recommend relevant content to its users • Netflix Recommendations: Netflix uses vector indexes to store and retrieve user embeddings and movie embeddings. This allows Netflix to recommend relevant movies to its users.
  • 10. Cosine Similarity Cosine similarity is a measure of similarity between two vectors. Mathematically, it is calculated by taking the dot product of the two vectors and dividing by the product of their magnitude’s cosine_similarity(x,y)=dot(x,y)/(||x|| * ||y||) where • x and y are two vectors • dot(x,y) is the dot product of the two vectors • ||x|| and ||y|| are the magnitudes of the vectors To illustrate cosine similarity with a hypothetical example, let's say we have two vectors x and y: x = [3, 2] (This represents vector x with two components, 3 and 2) y = [1, 4] (This represents vector y with two components, 1 and 4) Calculate the dot product of x and y: x • y = (3 * 1) + (2 * 4) = 3 + 8 = 11 Calculate the magnitudes of vectors x and y: ||x|| = √(3^2 + 2^2) = √(9 + 4) = √13 ||y|| = √(1^2 + 4^2) = √(1 + 16) = √17 Calculate the cosine similarity: cos(θ) = (x • y) / (||x|| * ||y||) = 11 / (√13 * √17) ≈ 0.745
  • 11. Vector Databases • Pinecone • Managed service, high performance, hybrid storage (in memory and disk) • Qdrant • Open-source, highly scalable, filtering • Weaviate • Open-source, semantic search, modular design (let you pick the best machine learning model) • Millvus • Open-source, cloud-native, Trillian-scale search • Faiss • Library, not a database (by Facebook), advanced algorithms, integration (Faiss excels when integrated with traditional databases for added vector search capability)
  • 12. Why Azure Congnitive Search? • Key Word Search • Vector Search • Hybrid Search • Advanced filtering • Semantic ( L2 reranking) • Built-in chunking • Bring your own vector
  • 13. Retrieval Augmented Generation (RAG) https://polite-ground-030dc3103.4.azurestaticapps.net/event/c555-ee52
  • 16. Building RAG applications • Azure AI Studio with Prompt flow • Co-Pilot studio • Semantic Kernel
  • 17. Demo • Vector embedding • Vector Search • RAG Application
  • 18. Reference • https://github.com/Azure/cognitive-search-vector-pr • https://learn.microsoft.com/en-us/azure/search/ • https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation- overview • https://learn.microsoft.com/en-us/azure/search/hybrid-search-overview • https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking • https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how- to/image-retrieval • https://huggingface.co/spaces/mteb/leaderboard
  • 19. Thanks for your time and trust!

Editor's Notes

  1. Hybrid search combines keyword search and vector search to achieve the best of both worlds. It uses keyword search to quickly identify the most relevant documents, and then uses vector search to refine and rerank those results. Benefits of Hybrid Search Hybrid search has several benefits over traditional keyword search: Improved accuracy: Hybrid search is able to handle complex and ambiguous queries more accurately than keyword search. Increased relevance: Hybrid search is able to rank results by relevance more effectively than keyword search. Wider range of queries: Hybrid search can handle a wider range of queries, including natural language queries and questions. Applications of Hybrid Search Hybrid search is being used in a variety of applications, including: Product recommendation: Hybrid search is used to recommend products to customers based on their past purchases and browsing history. Anomaly detection: Hybrid search is used to detect fraudulent transactions and other anomalies. Document search: Hybrid search is used to search for documents in large document repositories. Example of Hybrid Search Consider a user who is searching for "best restaurants near me." A keyword search would likely return a list of restaurants that are located near the user. However, this list might not include the best restaurants in the area. A hybrid search would use vector search to refine the results by considering factors such as the restaurant's cuisine, price range, and ambiance. The hybrid search would then rank the results by relevance, taking into account both the keyword search results and the vector search results. Conclusion Hybrid search is a powerful tool that can improve the accuracy and relevance of search results. It is a promising new technology that is likely to become even more popular in the years to come.
  2. In this example, the cosine similarity between vectors A and B is approximately 0.745. The value of cosine similarity ranges from -1 (completely dissimilar) to 1 (completely similar), with 0 indicating orthogonality (no similarity). A higher cosine similarity score indicates a stronger similarity between the vectors.
  3. Retrieval Augmented Generation -- adding a context into prompt embedding - semantic representation of bit of text build basic prompt, run flow against data, evaluate prompt flow, modify flow, run flow against larger dataset, evaluate prompt flow
  4. https://gloveboxes.github.io/prompt_flow_workshop/cheat_sheet/