RAG Patterns and Vector Search
in Generative AI
Udaiappa Ramachandran (Udai)
https://udai.io
About me
• Udaiappa Ramachandran (Udai)
• CTO/CSO, Akumina, Inc.
• Microsoft Azure MVP
• Cloud Expert
• Microsoft Azure, Amazon Web Services, and Google
• New Hampshire Cloud User Group (http://www.meetup.com/nashuaug)
• https://udai.io
Agenda
• Keyword Search
• Vector Search
• Hybrid Search
• OpenAI vector embeddings
• Azure Cognitive Search
• Demo…Demo…Demo…
Keyword Search
• Pros:
• Simple and easy to use
• Fast and efficient
• Scalable to very large data sets
• Well supported by existing search engines and other tools
• Cost-effective
• Easy to implement
• Cons:
• Can be inaccurate for ambiguous or complex queries
• Sensitive to typos and misspellings
• Does not understand the semantic relationships between words
• Language barriers
Keyword Search
Vector Embedded Search
• Pros:
• Better at understanding the semantics of language
• Can handle complex and ambiguous queries
• More robust to typos and misspellings
• Can be used for cross-lingual search
• Cons:
• Computationally expensive
• More difficult to implement and maintain
• Requires a pre-trained embedding model and storage for the document embeddings
• Not as well-supported by existing search engines and other tools
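For reference, a minimal sketch of the OpenAI vector embedding step (openai 1.x Python SDK); the model name and sample text are assumptions, and an Azure OpenAI deployment works the same way:

# Sketch: turn a piece of text into an embedding vector with the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.embeddings.create(
    model="text-embedding-3-small",      # assumed model name; any embedding model works
    input="What is hybrid search?",
)

vector = response.data[0].embedding      # list of floats (1536 dimensions for this model)
print(len(vector), vector[:3])

Documents are embedded the same way at indexing time; at query time the query text is embedded and compared against the stored document vectors.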
Vector Search
Hybrid Search
• Combines both keyword search and vector search
• Retrieve with keyword search, then refine and rerank the results with vector search (see the sketch after this list)
• Benefits:
• Improved accuracy
• Increased relevance
• Wider range of queries
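An illustrative, framework-free sketch of the retrieve-then-rerank idea above: a crude keyword pass narrows the candidates, then cosine similarity against the query embedding reranks them. The embed() argument is a placeholder for any embedding call; real engines typically fuse the keyword and vector scores instead (Azure Cognitive Search, for example, uses Reciprocal Rank Fusion).

# Sketch: two-stage hybrid search - keyword retrieval, then vector rerank.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query, docs, embed, top_k=5):
    # Stage 1: keyword retrieval - keep docs that share at least one query term
    # (fall back to all docs if nothing matches).
    terms = set(query.lower().split())
    candidates = [d for d in docs if terms & set(d["text"].lower().split())] or docs

    # Stage 2: vector rerank - order candidates by similarity to the query embedding.
    # (Embedding on every call here for clarity; cache document embeddings in practice.)
    q_vec = embed(query)
    ranked = sorted(candidates, key=lambda d: cosine(q_vec, embed(d["text"])), reverse=True)
    return ranked[:top_k]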
Vector Indexes in Real-World Applications
• Product recommendation: Amazon uses hybrid search to recommend products to its customers.
The search engine considers both the keywords that the customer has searched for and the
customer's past purchase history.
• Anomaly detection: Credit card companies use hybrid search to detect fraudulent transactions. The
search engine considers both the transaction amount and the location of the transaction.
• Document search: Google Scholar uses hybrid search to rank academic papers. The search engine
considers both the keywords in the paper's title and abstract, as well as the citations that the paper
has received.
• Google Search: Google Search uses vector indexes to store and retrieve document embeddings.
This allows Google to efficiently search and rank billions of web pages.
• Facebook Recommendations: Facebook uses vector indexes to store and retrieve user embeddings
and item embeddings. This allows Facebook to recommend relevant content to its users.
• Netflix Recommendations: Netflix uses vector indexes to store and retrieve user embeddings and
movie embeddings. This allows Netflix to recommend relevant movies to its users.
Cosine Similarity
Cosine similarity is a measure of similarity between two vectors. Mathematically, it is calculated by taking the dot product of the two vectors and dividing by the product of
their magnitudes:
cosine_similarity(x,y)=dot(x,y)/(||x|| * ||y||)
where
• x and y are two vectors
• dot(x,y) is the dot product of the two vectors
• ||x|| and ||y|| are the magnitudes of the vectors
To illustrate cosine similarity with a hypothetical example, let's say we have two vectors x and y:
x = [3, 2] (This represents vector x with two components, 3 and 2)
y = [1, 4] (This represents vector y with two components, 1 and 4)
Calculate the dot product of x and y: x • y = (3 * 1) + (2 * 4) = 3 + 8 = 11
Calculate the magnitudes of vectors x and y:
||x|| = √(3^2 + 2^2) = √(9 + 4) = √13
||y|| = √(1^2 + 4^2) = √(1 + 16) = √17
Calculate the cosine similarity:
cos(θ) = (x • y) / (||x|| * ||y||) = 11 / (√13 * √17) ≈ 0.74
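The same calculation in a few lines of NumPy, to check the arithmetic:

# Reproduces the worked example: cosine similarity of x = [3, 2] and y = [1, 4].
import numpy as np

x = np.array([3, 2])
y = np.array([1, 4])

cos_sim = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(round(float(cos_sim), 2))  # 0.74 = 11 / (sqrt(13) * sqrt(17))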
Vector Databases
• Pinecone
• Managed service, high performance, hybrid storage (in memory and disk)
• Qdrant
• Open-source, highly scalable, filtering
• Weaviate
• Open-source, semantic search, modular design (lets you pick the best machine learning model)
• Milvus
• Open-source, cloud-native, trillion-scale search
• Faiss
• Library, not a database (from Meta/Facebook); advanced algorithms; integrates with traditional
databases to add vector search capability (see the sketch below)
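A minimal Faiss sketch using a flat (exact) inner-product index over L2-normalized vectors, which is equivalent to cosine-similarity search; the dimensionality and the random data are placeholders:

# Sketch: exact nearest-neighbor search with Faiss.
import numpy as np
import faiss

dim = 128
doc_vectors = np.random.rand(1000, dim).astype("float32")
faiss.normalize_L2(doc_vectors)              # normalize so inner product == cosine similarity

index = faiss.IndexFlatIP(dim)               # brute-force inner-product index
index.add(doc_vectors)

query = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)         # top-5 most similar documents
print(ids[0], scores[0])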
Why Azure Cognitive Search?
• Keyword Search
• Vector Search
• Hybrid Search
• Advanced filtering
• Semantic ranking (L2 reranking)
• Built-in chunking
• Bring your own vector
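A rough sketch of a hybrid (keyword + vector) query with the azure-search-documents Python SDK, assuming version 11.4+; the endpoint, key, index name, field names, and embedding model are placeholders for your own setup:

# Sketch: hybrid query against an Azure Cognitive Search index that has a vector field.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from openai import OpenAI

# Embed the query text (same call as the earlier embedding sketch; model name assumed).
query = "how do I reset my password?"
query_vector = OpenAI().embeddings.create(
    model="text-embedding-3-small", input=query
).data[0].embedding

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",   # placeholder
    index_name="<your-index>",                               # placeholder
    credential=AzureKeyCredential("<your-query-key>"),       # placeholder
)

results = search_client.search(
    search_text=query,                                       # keyword (BM25) side
    vector_queries=[VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=5,
        fields="contentVector",                              # assumed vector field name
    )],
    top=5,
)

for doc in results:
    print(doc["@search.score"], doc.get("title"))            # "title" is a placeholder field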
Retrieval Augmented Generation (RAG)
https://polite-ground-030dc3103.4.azurestaticapps.net/event/c555-ee52
Retrieval Augmented Generation (RAG)
Retrieval Augmented Generation (RAG)
Building RAG applications
• Azure AI Studio with Prompt flow
• Copilot Studio
• Semantic Kernel
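Whichever of these you pick, the core RAG loop itself is small. A framework-free sketch, where retrieve() stands in for any of the searches above and the model name is an assumption:

# Sketch: retrieve relevant chunks, add them to the prompt as context, generate an answer.
from openai import OpenAI

client = OpenAI()

def answer(question, retrieve, top_k=3):
    chunks = retrieve(question, top_k)        # e.g. hybrid or vector search results (list of strings)
    context = "\n\n".join(chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",                  # assumed model name
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context. "
                        "If the answer is not in the context, say you don't know."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content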
Demo
• Vector embedding
• Vector Search
• RAG Application
Reference
• https://github.com/Azure/cognitive-search-vector-pr
• https://learn.microsoft.com/en-us/azure/search/
• https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview
• https://learn.microsoft.com/en-us/azure/search/hybrid-search-overview
• https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking
• https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/image-retrieval
• https://huggingface.co/spaces/mteb/leaderboard
Thanks for your time and trust!


Editor's Notes

  • #9 Hybrid search combines keyword search and vector search to achieve the best of both worlds: it uses keyword search to quickly identify the most relevant documents, then uses vector search to refine and rerank those results.
    Benefits over traditional keyword search: improved accuracy (handles complex and ambiguous queries better), increased relevance (ranks results more effectively), and a wider range of queries (including natural-language queries and questions).
    Applications: product recommendation (based on past purchases and browsing history), anomaly detection (fraudulent transactions and other anomalies), and document search over large document repositories.
    Example: a user searches for "best restaurants near me." A keyword search returns restaurants located near the user, but not necessarily the best ones. A hybrid search refines those results with vector search, considering factors such as cuisine, price range, and ambiance, and then reranks by relevance using both the keyword and vector results.
    Conclusion: hybrid search is a powerful tool that improves the accuracy and relevance of search results and is likely to become even more popular in the years to come.
  • #11 In this example, the cosine similarity between vectors x and y is approximately 0.74. Cosine similarity ranges from -1 (completely dissimilar) to 1 (completely similar), with 0 indicating orthogonality (no similarity). A higher cosine similarity score indicates a stronger similarity between the vectors.
  • #14 Retrieval Augmented Generation: add retrieved context into the prompt. An embedding is a semantic representation of a piece of text. Prompt flow loop: build a basic prompt, run the flow against data, evaluate the prompt flow, modify the flow, run the flow against a larger dataset, and evaluate again.
  • #19 https://gloveboxes.github.io/prompt_flow_workshop/cheat_sheet/