Francisco Javier Arceo
Senior Principal Software Engineer, Red Hat
Kubeflow Steering Committee Member
Feast Maintainer
Feast, RAG,
and Milvus
Hello! 👋
A little about me
Led Data Science, Data Engineering, and ML Infra teams
at different companies
Somehow stumbled into maintaining Feast, the open
source feature store
Get to work on a mixture of distributed training,
pipelines, feature store, RAG, and agents!
In my ample free time I like to write code
I've spent 12+ years building AI/ML
solutions for banks and fintechs
Joined Red Hat to work on Open Source AI
My wife, our 2 children, and I call NJ home 🤠
What is RAG?
Retrieval Augmented Generation
Published at NeurIPS 2020
Query Encoding
Retriever + Generator
Meta AI Research team
A pretrained encoder
In the seminal paper, they ran end-to-end
backpropagation/fine-tuning on both the
Retriever and Generator
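For reference, the end-to-end objective can be written as a marginal likelihood over retrieved documents. The form below is the RAG-Sequence variant as I recall it from the original paper; the symbols (retriever parameters \eta, generator parameters \theta, document encoder d, query encoder q) follow the paper and do not appear on these slides. In the paper the document encoder and index were kept fixed, while the query encoder and the generator were fine-tuned.

```latex
\[
p_{\text{RAG-Sequence}}(y \mid x) \;\approx\;
\sum_{z \,\in\, \text{top-}k\left(p_\eta(\cdot \mid x)\right)}
p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta\left(y_i \mid x, z, y_{1:i-1}\right),
\qquad
p_\eta(z \mid x) \;\propto\; \exp\left(\mathbf{d}(z)^\top \mathbf{q}(x)\right)
\]
```

Because the retrieval score enters the sum as a weight, gradients from the generator's loss flow back into the query encoder, which is what makes the training end-to-end.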
Why did RAG become so popular?
OpenAI
Published at NeurIPS 2020
ChatGPT took flight in Nov 2022
Google Trends shows the takeoff
Most RAG applications only use
inference 😅
Meta AI Research team
They suggested using RAG 🤯
Easier to do than fine-tuning!
How does RAG work?
The Simplest RAG
Embed Data
Take documents/text and convert them into a numeric
(vector) representation
Insert Data into datastore
Insert all of that data (often in batch)
Embed User Query
In real-time, embed a user's query
Retrieve Documents with
Vector Similarity Search
Compute the cosine similarity between the query
vector and every stored vector, and return the
top k matches
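The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not how Feast or Milvus implement it: the `embed` function is a toy word-count stand-in for a real embedding model, the "datastore" is an in-memory dict, and the search is brute force over every stored vector. All names here are assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text, vocab):
    """Toy embedding: word counts over a fixed vocabulary.

    A stand-in for a real embedding model, used only to keep the sketch runnable.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(counts[word]) for word in vocab]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k):
    """Brute-force search: score every stored vector and return the k best documents."""
    scored = [(cosine(query_vec, vec), doc) for doc, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


docs = [
    "Feast is an open source feature store",
    "Milvus is a vector database",
    "Docling converts documents into text",
]
vocab = sorted({w for d in docs for w in re.findall(r"[a-z]+", d.lower())})

# Steps 1 and 2: embed the documents and insert them into the (in-memory) datastore.
store = {doc: embed(doc, vocab) for doc in docs}

# Step 3: embed the user's query at request time.
query_vec = embed("which vector database should I use", vocab)

# Step 4: retrieve the most similar document.
results = top_k(query_vec, store, k=1)
print(results)
```

A vector database replaces the brute-force loop in `top_k` with an index so that the search stays fast as the number of stored vectors grows.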
How can Feast help with RAG?
Empowers MLEs to do what they do best: harness the power of their data!
Easy to ship RAG to
production!
Battle-tested support for real-time,
batch, and streaming
Built to scale for distributed
computing and ingestion
Fine-tuning as a first-class
citizen
Fully Open Source!
Feast in Production
Feast treats inference and fine-tuning as first-class citizens
Online Infrastructure
Offline Infrastructure
Scale
For model inference / RAG
For model fine-tuning
Kubernetes (Helm + Operator)
Feast 🤝Milvus 🤝Docling
Talk with your Docs!
Feast 🤝Milvus 🤝Docling
Feast Objects
Entities
These are primary keys
Data Sources
Files and Request objects (e.g., a CSV and an
API call)
Feature Views
These define a collection of features/fields
where we can easily enable vector search
during retrieval
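One way to picture how the three objects relate is with plain dataclasses. The classes below are hypothetical stand-ins written for this sketch, not Feast's actual classes or signatures; the field names (`chunk_id`, `chunk_text`, `embedding`) and the `vector_index` flag are likewise assumptions chosen to mirror the slide. See the Feast documentation for the real definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """The primary key that rows are looked up by (hypothetical stand-in)."""
    name: str


@dataclass(frozen=True)
class DataSource:
    """Where the raw data comes from: a file (e.g. a CSV) or a request (e.g. an API call)."""
    name: str
    kind: str  # "file" or "request"


@dataclass(frozen=True)
class Field:
    """One column of a feature view; vector_index marks it as searchable by similarity."""
    name: str
    dtype: str
    vector_index: bool = False


@dataclass
class FeatureView:
    """A named collection of fields tied to an entity and a data source."""
    name: str
    entity: Entity
    source: DataSource
    schema: list = field(default_factory=list)

    def vector_fields(self):
        """Names of the fields that vector search can run against."""
        return [f.name for f in self.schema if f.vector_index]


chunk = Entity(name="chunk_id")
source = DataSource(name="docling_chunks", kind="file")
docs_view = FeatureView(
    name="document_chunks",
    entity=chunk,
    source=source,
    schema=[
        Field("chunk_text", "string"),
        Field("embedding", "float32[]", vector_index=True),
    ],
)
print(docs_view.vector_fields())
```

The point of the structure is that vector search is enabled per field inside the feature view, so the text and its embedding are retrieved together under the same key.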
Feast 🤝Milvus 🤝Docling
Document/Data Transformation!
Feast allows for Feature
Transformation in
Decorators!
Batch Compute Engines (e.g., Spark)
Streaming Compute Engines (e.g., Spark,
Flink)
API Servers (e.g., the Feast Feature Server)
Defines entities, schemas, data sources, and
some other configurations
Allows MLEs to easily take data to
production
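The idea of declaring a transformation once with a decorator, then letting different runtimes apply it, can be sketched in plain Python. Everything here is hypothetical: the `transformation` decorator, the `TRANSFORMS` registry, and `apply_all` are invented for illustration. Feast's own decorator-based API (for example `on_demand_feature_view`) has a different signature and is documented in the Feast docs.

```python
from typing import Callable, Dict, List

# Registry of named row-level transformations (hypothetical, for illustration only).
TRANSFORMS: Dict[str, Callable[[dict], dict]] = {}


def transformation(name: str):
    """Decorator that registers a function under `name` and returns it unchanged."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        TRANSFORMS[name] = fn
        return fn
    return register


@transformation("chunk_stats")
def chunk_stats(row: dict) -> dict:
    """Strip surrounding whitespace from the chunk text and add a word count."""
    text = row["chunk_text"].strip()
    return {**row, "chunk_text": text, "num_words": len(text.split())}


def apply_all(rows: List[dict]) -> List[dict]:
    """Apply every registered transformation to each row, in registration order.

    A batch job, a streaming job, or an API server could each call this same
    function, which is the appeal of declaring the logic once.
    """
    out = []
    for row in rows:
        for fn in TRANSFORMS.values():
            row = fn(row)
        out.append(row)
    return out


rows = apply_all([{"chunk_id": "c1", "chunk_text": "  Feast serves features  "}])
print(rows)
```

Keeping the transformation as an ordinary function also makes it easy to unit test before it ever touches a compute engine.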
Feast 🤝Milvus 🤝Docling
Document/Data Ingestion
Ingestion in Feast is simple
Supports more scalable
ingestion as well
Several API endpoints available
More details in the docs
Feast Roadmap 🚀
What's on the horizon for Feast?
More NLP!
We want Feast to be the go-to framework for AI users to customize their RAG
solutions, and that means investing more in Milvus
Image Support
Images often benefit from metadata in recommender systems, and we intend to
enhance Feast in this space, in part because the benefits for RAG are very clear
Scaling Batch with Spark and Ray
We plan to continue to invest in the Spark development experience
We plan to add Ray as a new compute engine
Latency Improvements
We want to make Feast blazing fast and have made significant progress here
Thank you!
Here are some useful links:
Feast RAG Blog Post
Feast Documentation
Feast Website
GitHub Repo with Demo
GitHub Repo with Docling Demo

Smarter RAG Pipelines: Scaling Search with Milvus and Feast
