RAG Patterns and Vector Search
in Generative AI
Udaiappa Ramachandran (Udai)
https://udai.io
About me
• Udaiappa Ramachandran (Udai)
• CTO/CSO, Akumina, Inc.
• Microsoft Azure MVP
• Cloud Expert
• Microsoft Azure, Amazon Web Services, and Google
• New Hampshire Cloud User Group (http://www.meetup.com/nashuaug)
• https://udai.io
Agenda
• Keyword Search
• Vector Search
• Hybrid Search
• OpenAI vector embeddings
• Azure Cognitive Search
• Demo…Demo…Demo…
Keyword Search
• Pros:
• Simple and easy to use
• Fast and efficient
• Scalable to very large data sets
• Well supported by existing search engines and other tools
• Cost-effective
• Easy to implement
• Cons:
• Can be inaccurate for ambiguous or complex queries
• Sensitive to typos and misspellings
• Does not understand the semantic relationships between words
• Language barriers
Keyword Search
Vector Embedded Search
• Pros:
• Better at understanding the semantics of language
• Can handle complex and ambiguous queries
• More robust to typos and misspellings
• Can be used for cross-lingual search
• Cons:
• Computationally expensive
• More difficult to implement and maintain
• Requires a pre-trained embedding model and storage for the document embeddings
• Not as well-supported by existing search engines and other tools
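For reference, a minimal sketch of the OpenAI vector embedding step (openai 1.x Python SDK); the model name and sample text are assumptions, and an Azure OpenAI deployment works the same way:

# Sketch: turn a piece of text into an embedding vector with the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.embeddings.create(
    model="text-embedding-3-small",      # assumed model name; any embedding model works
    input="What is hybrid search?",
)

vector = response.data[0].embedding      # list of floats (1536 dimensions for this model)
print(len(vector), vector[:3])

Documents are embedded the same way at indexing time; at query time the query text is embedded and compared against the stored document vectors.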
Vector Search
Hybrid Search
• Combines both keyword search and vector search
• Retrieve with keyword search, then refine and rerank the results with vector search (see the sketch after this list)
• Benefits:
• Improved accuracy
• Increased relevance
• Wider range of queries
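An illustrative, framework-free sketch of the retrieve-then-rerank idea above: a crude keyword pass narrows the candidates, then cosine similarity against the query embedding reranks them. The embed() argument is a placeholder for any embedding call; real engines typically fuse the keyword and vector scores instead (Azure Cognitive Search, for example, uses Reciprocal Rank Fusion).

# Sketch: two-stage hybrid search - keyword retrieval, then vector rerank.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query, docs, embed, top_k=5):
    # Stage 1: keyword retrieval - keep docs that share at least one query term
    # (fall back to all docs if nothing matches).
    terms = set(query.lower().split())
    candidates = [d for d in docs if terms & set(d["text"].lower().split())] or docs

    # Stage 2: vector rerank - order candidates by similarity to the query embedding.
    # (Embedding on every call here for clarity; cache document embeddings in practice.)
    q_vec = embed(query)
    ranked = sorted(candidates, key=lambda d: cosine(q_vec, embed(d["text"])), reverse=True)
    return ranked[:top_k]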
Vector Indexes in Real-World Applications
• Product recommendation: Amazon uses hybrid search to recommend products to its customers.
The search engine considers both the keywords that the customer has searched for and the
customer's past purchase history.
• Anomaly detection: Credit card companies use hybrid search to detect fraudulent transactions. The
search engine considers both the transaction amount and the location of the transaction.
• Document search: Google Scholar uses hybrid search to rank academic papers. The search engine
considers both the keywords in the paper's title and abstract, as well as the citations that the paper
has received.
• Google Search: Google Search uses vector indexes to store and retrieve document embeddings.
This allows Google to efficiently search and rank billions of web pages.
• Facebook Recommendations: Facebook uses vector indexes to store and retrieve user embeddings
and item embeddings. This allows Facebook to recommend relevant content to its users.
• Netflix Recommendations: Netflix uses vector indexes to store and retrieve user embeddings and
movie embeddings. This allows Netflix to recommend relevant movies to its users.
Cosine Similarity
Cosine similarity is a measure of similarity between two vectors. Mathematically, it is calculated by taking the dot product of the two vectors and dividing by the product of
their magnitudes:
cosine_similarity(x,y)=dot(x,y)/(||x|| * ||y||)
where
• x and y are two vectors
• dot(x,y) is the dot product of the two vectors
• ||x|| and ||y|| are the magnitudes of the vectors
To illustrate cosine similarity with a hypothetical example, let's say we have two vectors x and y:
x = [3, 2] (This represents vector x with two components, 3 and 2)
y = [1, 4] (This represents vector y with two components, 1 and 4)
Calculate the dot product of x and y: x • y = (3 * 1) + (2 * 4) = 3 + 8 = 11
Calculate the magnitudes of vectors x and y:
||x|| = √(3^2 + 2^2) = √(9 + 4) = √13
||y|| = √(1^2 + 4^2) = √(1 + 16) = √17
Calculate the cosine similarity:
cos(θ) = (x • y) / (||x|| * ||y||) = 11 / (√13 * √17) ≈ 0.74
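The same calculation in a few lines of NumPy, to check the arithmetic:

# Reproduces the worked example: cosine similarity of x = [3, 2] and y = [1, 4].
import numpy as np

x = np.array([3, 2])
y = np.array([1, 4])

cos_sim = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(round(float(cos_sim), 2))  # 0.74 = 11 / (sqrt(13) * sqrt(17))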
Vector Databases
• Pinecone
• Managed service, high performance, hybrid storage (in memory and disk)
• Qdrant
• Open-source, highly scalable, filtering
• Weaviate
• Open-source, semantic search, modular design (lets you pick the best machine learning model)
• Milvus
• Open-source, cloud-native, trillion-scale search
• Faiss
• Library, not a database (from Meta/Facebook); advanced algorithms; integrates with traditional
databases to add vector search capability (see the sketch below)
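A minimal Faiss sketch using a flat (exact) inner-product index over L2-normalized vectors, which is equivalent to cosine-similarity search; the dimensionality and the random data are placeholders:

# Sketch: exact nearest-neighbor search with Faiss.
import numpy as np
import faiss

dim = 128
doc_vectors = np.random.rand(1000, dim).astype("float32")
faiss.normalize_L2(doc_vectors)              # normalize so inner product == cosine similarity

index = faiss.IndexFlatIP(dim)               # brute-force inner-product index
index.add(doc_vectors)

query = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)         # top-5 most similar documents
print(ids[0], scores[0])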
Why Azure Cognitive Search?
• Keyword Search
• Vector Search
• Hybrid Search
• Advanced filtering
• Semantic ranking (L2 reranking)
• Built-in chunking
• Bring your own vector
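A rough sketch of a hybrid (keyword + vector) query with the azure-search-documents Python SDK, assuming version 11.4+; the endpoint, key, index name, field names, and embedding model are placeholders for your own setup:

# Sketch: hybrid query against an Azure Cognitive Search index that has a vector field.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from openai import OpenAI

# Embed the query text (same call as the earlier embedding sketch; model name assumed).
query = "how do I reset my password?"
query_vector = OpenAI().embeddings.create(
    model="text-embedding-3-small", input=query
).data[0].embedding

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",   # placeholder
    index_name="<your-index>",                               # placeholder
    credential=AzureKeyCredential("<your-query-key>"),       # placeholder
)

results = search_client.search(
    search_text=query,                                       # keyword (BM25) side
    vector_queries=[VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=5,
        fields="contentVector",                              # assumed vector field name
    )],
    top=5,
)

for doc in results:
    print(doc["@search.score"], doc.get("title"))            # "title" is a placeholder field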
Retrieval Augmented Generation (RAG)
https://polite-ground-030dc3103.4.azurestaticapps.net/event/c555-ee52
Retrieval Augmented Generation (RAG)
Retrieval Augmented Generation (RAG)
Building RAG applications
• Azure AI Studio with Prompt flow
• Copilot Studio
• Semantic Kernel
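Whichever of these you pick, the core RAG loop itself is small. A framework-free sketch, where retrieve() stands in for any of the searches above and the model name is an assumption:

# Sketch: retrieve relevant chunks, add them to the prompt as context, generate an answer.
from openai import OpenAI

client = OpenAI()

def answer(question, retrieve, top_k=3):
    chunks = retrieve(question, top_k)        # e.g. hybrid or vector search results (list of strings)
    context = "\n\n".join(chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",                  # assumed model name
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context. "
                        "If the answer is not in the context, say you don't know."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content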
Demo
• Vector embedding
• Vector Search
• RAG Application
Reference
• https://github.com/Azure/cognitive-search-vector-pr
• https://learn.microsoft.com/en-us/azure/search/
• https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview
• https://learn.microsoft.com/en-us/azure/search/hybrid-search-overview
• https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking
• https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/image-retrieval
• https://huggingface.co/spaces/mteb/leaderboard
Thanks for your time and trust!


Editor's Notes

  • #9 Hybrid search combines keyword search and vector search to achieve the best of both worlds: it uses keyword search to quickly identify the most relevant documents, then uses vector search to refine and rerank those results.
    Benefits over traditional keyword search: improved accuracy (handles complex and ambiguous queries better), increased relevance (ranks results more effectively), and a wider range of queries (including natural-language queries and questions).
    Applications: product recommendation (based on past purchases and browsing history), anomaly detection (fraudulent transactions and other anomalies), and document search over large document repositories.
    Example: a user searches for "best restaurants near me." A keyword search returns restaurants located near the user, but not necessarily the best ones. A hybrid search refines those results with vector search, considering factors such as cuisine, price range, and ambiance, and then reranks by relevance using both the keyword and vector results.
    Conclusion: hybrid search is a powerful tool that improves the accuracy and relevance of search results and is likely to become even more popular in the years to come.
  • #11 In this example, the cosine similarity between vectors x and y is approximately 0.74. Cosine similarity ranges from -1 (completely dissimilar) to 1 (completely similar), with 0 indicating orthogonality (no similarity). A higher cosine similarity score indicates a stronger similarity between the vectors.
  • #14 Retrieval Augmented Generation: add retrieved context into the prompt. An embedding is a semantic representation of a piece of text. Prompt flow loop: build a basic prompt, run the flow against data, evaluate the prompt flow, modify the flow, run the flow against a larger dataset, and evaluate again.
  • #19 https://gloveboxes.github.io/prompt_flow_workshop/cheat_sheet/