Real-time Machine Learning with Hopsworks
An integrated Feature Store and Model Serving platform
Jim Dowling - CEO
Where business value is generated in AI
[Chart axes: ML Operational Capabilities; Business Value]
[Chart labels: Traditional Analytics; Training/Test Data; Analytical ML (offline predictions, batch updates; offline inference, batch features; batch jobs, Offline Feature Store); Operational ML (online predictions, batch updates; online inference, batch features; Model Serving, Online Feature Store); Real-Time Machine Learning (online predictions, real-time updates; online inference, streaming features; Model Serving, Online Feature Store)]
Where Feature Stores and Model Serving meet
[Diagram labels: Data warehouse; Applications - Services; Feature Store; Feature Groups; Feature Views; Search, Versioning, Statistics, Code, Lineage, Provenance; Models; Model Registry; KServe; Online Applications & Services]
From Raw Data to Online Predictions
[Diagram labels: Feature Store; Feature Groups; Feature Views; Batch (DataFrames); Streaming (Data Instances); Read Feature Vectors (Online API); Read Files/DataFrames (Offline API); Models; Model Registry; Model Artifact; Deploy; Model Serving; Model Server; Code; Model files; Prediction Service; Transformer; Predictor; Inference Logger; Inference logs (Data Instances); Online Predictions (REST API); Search, Versioning, Statistics, Transformations, Lineage, Provenance; Versioning, Experiments, Metrics, Code; Canary, A/B Testing]
Keeping Your Pipelines on Track
[Diagram labels: Model Registry; Batch Apps; Online Apps; Feature Groups; Feature Views; Vector DB; Training Pipelines; Inference Pipelines; Online; Offline; Model artifact; Index Creation; Encoder. Annotations on components: schema, transformation functions, versioning, experiments]
✓ Versioning →
■ code: feature engineering, transformation functions, model training, model serving scripts
■ assets: model files, model artifacts, experiments
■ configuration: experiment settings, deployments, indexes
✓ Schema management → columns and data types for feature groups (fg), feature views (fv), models, deployments
✓ Transformation functions → avoid training / serving skew
✓ Provenance and Lineage → track predictions down to the ingested features
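The point about transformation functions and training/serving skew can be illustrated with a minimal sketch. It does not use the Hopsworks API; the function names (`fit_standard_scaler`, `standard_scale`, `transform_request`) and the price values are hypothetical. The idea shown is only that statistics are computed once from training data and that the same function is then applied in both the training and the serving path.

```python
from statistics import mean, pstdev


def fit_standard_scaler(values):
    """Compute scaler statistics once, from the training data only."""
    sigma = pstdev(values)
    # Guard against a constant column, which would give a zero divisor.
    return {"mean": mean(values), "std": sigma if sigma > 0 else 1.0}


def standard_scale(value, stats):
    """The single transformation function shared by both pipelines."""
    return (value - stats["mean"]) / stats["std"]


# Training pipeline: fit the statistics, then transform the training column.
train_prices = [10.0, 20.0, 30.0, 40.0]  # hypothetical feature values
price_stats = fit_standard_scaler(train_prices)
train_scaled = [standard_scale(p, price_stats) for p in train_prices]


# Serving pipeline: reuse the stored statistics and the same function,
# so a raw value seen online is scaled exactly as it was during training.
def transform_request(raw_price, stats=price_stats):
    return standard_scale(raw_price, stats)
```

Skew appears when the serving path re-implements the scaling or recomputes the statistics from different data; keeping one versioned function and one set of stored statistics avoids both.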
[Diagram labels, continued: Provenance; versioning; Data warehouse (historical data); Applications, Service (context, trends); Feature Pipelines; Batch; Streaming]
A Closer Look at Inference Pipelines
[Diagram labels: Data warehouse (historical data); Applications, Service (context, trends); Feature Pipelines; Batch; Streaming; Feature Groups; Feature Views; Vector DB; Training Pipelines; Model Registry; Index Creation; Encoder; Model artifact; Offline; Batch Inference Jobs; Batch data; Batch predictions; Batch Apps; Online; Prediction Service; Transformer; Predictor; Model artifact; Recent features; Embeddings; Online predictions; Inference logs; Inference logger; Online Apps]
A Deeper Look at Real-time Inference Pipelines
[Diagram labels: Inference Request; Feature Store; Streaming Feature Pipeline; Feature Groups FG 1 to FG 5; Feature Views FV 1 to FV 3; Features: Feature 1 to Feature 9, with Feature 4 and Feature 7 marked as primary keys (pk); Model Serving; Transformer; Feature 4 (pk); Feature Vector; Lookups; Vector DB; Embeddings; Embedding space; Similarity search; mapping; Predictor; Model; Model Input; Prediction; Inference Response; Inference logs; Feedback; Online Apps]
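One way to read the labels on this slide is as a request path: the Transformer takes the primary key from the inference request, looks up the feature vector, and hands the model input to the Predictor, with each call recorded in the inference logs. The sketch below follows that reading under stated assumptions: the in-memory dictionary stands in for the online feature store, the linear scorer stands in for a real model, and every name in it is hypothetical rather than part of KServe or Hopsworks.

```python
# In-memory stand-in for an online feature store, keyed by primary key.
ONLINE_FEATURES = {
    101: [0.2, 1.5, 3.0],
    102: [0.9, 0.1, 2.0],
}

# Stand-in for the inference logger's sink.
INFERENCE_LOG = []


def transformer(request):
    """Turn an inference request into model input via a primary-key lookup."""
    return ONLINE_FEATURES[request["pk"]]


def predictor(model_input):
    """Hypothetical model: a fixed linear scorer over the feature vector."""
    weights = [1.0, 0.5, -0.25]
    return sum(w * x for w, x in zip(weights, model_input))


def prediction_service(request):
    """Transformer, then predictor, then log the call for later feedback."""
    model_input = transformer(request)
    prediction = predictor(model_input)
    INFERENCE_LOG.append(
        {"request": request, "model_input": model_input, "prediction": prediction}
    )
    return {"prediction": prediction}
```

The client only sends the key; the features it never sees are joined in on the server side, which is what lets feature pipelines update them independently of the application.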
Real-Time, Personalized Recommendation Systems
Candidate Retrieval and Ranking
Embeddings compress high-dimensional data, retaining semantic relationships
[Diagram: Features (current user search, user session data, user purchases, user profile) → User-Query Encoder → Embedding]
What about Multi-Modal Similarity Search?
Can a “user query” find “items” with similarity search?
Yes, by mapping the “user query” embedding into the “item” embedding space with a two-tower model.
Representation learning for retrieval usually involves supervised learning with labeled or pseudo-labeled data from user-item interactions.
Training data for our Two-Tower Model will be User-Item Interactions
Log user-item interactions as training data for our two-tower model and ranking model.
[Diagram: a search on a retail website returns Item 1 to Item 4, each with its features. Logged interactions: Click 2, Click 3, Purchase 3. Resulting scores: Item 1 = 0, Item 2 = 1, Item 3 = 5, Item 4 = 0]
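The scoring in this example can be reproduced with a small sketch. The slide does not state the scoring rule, so the weights below (click = 1, purchase = 5) and the rule "an item's score is its highest-weighted event" are assumptions chosen only because they reproduce the scores shown; the function and variable names are hypothetical.

```python
# Assumed event weights; the slide shows the resulting scores, not the rule.
EVENT_WEIGHTS = {"click": 1, "purchase": 5}


def score_interactions(shown_items, events):
    """Label every shown item with the weight of its strongest interaction."""
    scores = {item: 0 for item in shown_items}  # no interaction scores 0
    for event_type, item in events:
        scores[item] = max(scores[item], EVENT_WEIGHTS[event_type])
    return scores


# The interactions from the slide: Purchase 3, Click 2, Click 3.
shown = [1, 2, 3, 4]
events = [("purchase", 3), ("click", 2), ("click", 3)]
scores = score_interactions(shown, events)
# scores == {1: 0, 2: 1, 3: 5, 4: 0}
```

Items that were shown but ignored keep a score of 0, which gives the models negative examples as well as positive ones.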
Training the Two-Tower Embedding Model
[Diagram: User features, preferences, history → User Query encoder → User Query embeddings. Item category, price, popularity, etc. → Item encoder → Item embeddings. The dot product of the two embeddings feeds the loss function, trained on User-Item Interactions training data, where 0 → non-interaction and 1 → highest interaction]
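A toy version of this training setup is sketched below in plain Python. It makes several simplifying assumptions that are not in the slides: each tower is a single linear layer, the loss is binary cross-entropy on the sigmoid of the dot product, training is plain stochastic gradient descent, and the one-hot user and item features are made up. A real two-tower model would use deeper encoders and a deep learning framework.

```python
import math
import random


def encode(weights, features):
    """Linear tower: embedding = weights @ features."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_two_tower(examples, user_dim, item_dim, emb_dim=2,
                    lr=0.1, epochs=500, seed=0):
    """Fit both towers so that sigmoid(user_emb . item_emb) matches the label."""
    rng = random.Random(seed)

    def init(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    w_user, w_item = init(emb_dim, user_dim), init(emb_dim, item_dim)
    for _ in range(epochs):
        for user_x, item_x, label in examples:
            u, v = encode(w_user, user_x), encode(w_item, item_x)
            # Gradient of binary cross-entropy w.r.t. the dot-product logit.
            err = sigmoid(dot(u, v)) - label
            for k in range(emb_dim):
                for j in range(user_dim):
                    w_user[k][j] -= lr * err * v[k] * user_x[j]
                for j in range(item_dim):
                    w_item[k][j] -= lr * err * u[k] * item_x[j]
    return w_user, w_item


# Hypothetical interactions: (user features, item features, label in [0, 1]).
examples = [
    ([1, 0], [1, 0], 1.0),  # user A interacted with item X
    ([1, 0], [0, 1], 0.0),  # user A ignored item Y
    ([0, 1], [1, 0], 0.0),  # user B ignored item X
    ([0, 1], [0, 1], 1.0),  # user B interacted with item Y
]
w_user, w_item = train_two_tower(examples, user_dim=2, item_dim=2)


def interaction_probability(user_x, item_x):
    return sigmoid(dot(encode(w_user, user_x), encode(w_item, item_x)))
```

Because the only link between the towers is the dot product, the item tower can be run offline over the whole catalog while the user tower runs at request time.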
Model Training for Embedding Models and Ranking Model
[Diagram labels: Feature Store; item, user, clicks; Feature Views (items, user queries); Training Data (retrieval.csv, ranking.csv); Train Models; Models (Ranking, User/Query Embedding, Item Embedding); Hopsworks Model Registry]
Build the ANN Index on Items. Similarity Search with user queries on it.
[Diagram: items.csv → job computes embeddings for all items (encode all items) → insert all pairs (item-ID, embedding) into OpenSearch k-NN (ANN Index)]
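The index-building job and the query against it can be sketched as follows. This is not OpenSearch code: an exact brute-force search over a Python list is swapped in for the ANN index, the identity function stands in for the trained item encoder, and the item IDs and vectors are hypothetical. An ANN index returns approximately the same top-k without scanning every item.

```python
def build_index(items, item_encoder):
    """Encode every item and store (item-ID, embedding) pairs."""
    return [(item_id, item_encoder(features)) for item_id, features in items.items()]


def search(index, query_embedding, k=2):
    """Exact top-k by dot product; an ANN index would approximate this."""
    scored = [
        (sum(q * e for q, e in zip(query_embedding, embedding)), item_id)
        for item_id, embedding in index
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical item features; the identity encoder is a placeholder
# for the item tower of a trained two-tower model.
items = {"item-1": [1.0, 0.0], "item-2": [0.7, 0.7], "item-3": [0.0, 1.0]}
index = build_index(items, item_encoder=lambda features: list(features))

# A user-query embedding in the same space as the item embeddings.
top = search(index, query_embedding=[1.0, 0.2], k=2)
# top == ["item-1", "item-2"]
```

The index only has to be rebuilt when items or the item encoder change, while queries are encoded fresh on every request.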
Two-Tower Network with a Vector Database for ANN Search
Source: https://cloud.google.com/blog/products/ai-machine-learning/vertex-matching-engine-blazing-fast-and-massively-scalable-nearest-neighbor-search
Retrieval and Ranking for Personalized Real-time Recommendation Systems
[Diagram labels: Search; Features; User-Query Encoder; User-Query Embedding; Candidate Retrieval; OpenSearch k-NN (items); Candidate Items; Get Features for items; Hopsworks Feature Store; Trends, Feedback; Ranking Model; Ranked items]
Real-time Recommendation Systems
[Diagram labels: Recommendation request; Query Model; Enrich with item/user features; Retrieve closest candidates using similarity search; Candidate 1, Candidate 2, Candidate N; Enrich with features for candidates; Ranking Model; Ranked candidates; Recommended candidates]
Real-time Recommendation Systems with Hopsworks
[Diagram labels: User; Recommendation request; Query Model (KServe Deployment: Transformer, Predictor); Enrich with item/user features; Retrieve closest candidates with similarity search; OpenSearch k-NN; Candidate 1, Candidate 2, Candidate N; Enrich with features for candidates; Hopsworks Feature Store; Ranking Model (KServe Deployment: Transformer, Predictor); Ranked candidates; Recommended candidates]
Extended Retrieval and Ranking Architecture
Embeddings, Retrieval, Filtering, Ranking
User/Query & Item Embeddings: Jointly train the user/query embedding model and the item embedding model with a two-tower model. Build an Approximate Nearest Neighbor (ANN) index with the items and the item embedding model.
Retrieval: Retrieve candidate items from the ANN index based on the user embedding (similarity search).
Filtering: Remove candidate items for various reasons:
• underage user
• item sold out
• item bought before
• item not available in user’s region
Ranking: With a ranking model, score all the candidate items with both user and item features, ensuring candidate diversity.
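The retrieval, filtering, and ranking stages can be chained as in the sketch below. Everything specific in it is an assumption for illustration: the field names (`adult_only`, `stock`, `regions`, `purchased`), the age threshold of 18, the exact dot-product retrieval in place of an ANN index, and the ranking model passed in as a plain function. Candidate diversity is not handled here.

```python
def retrieve(user_embedding, item_embeddings, k):
    """Retrieval: top-k items by dot product with the user embedding."""
    def similarity(item_id):
        return sum(u * e for u, e in zip(user_embedding, item_embeddings[item_id]))
    return sorted(item_embeddings, key=similarity, reverse=True)[:k]


def filter_candidates(candidates, user, catalog):
    """Filtering: drop candidates for the reasons listed on the slide."""
    kept = []
    for item_id in candidates:
        item = catalog[item_id]
        if item["adult_only"] and user["age"] < 18:  # assumed age threshold
            continue  # underage user
        if item["stock"] == 0:
            continue  # item sold out
        if item_id in user["purchased"]:
            continue  # item bought before
        if user["region"] not in item["regions"]:
            continue  # item not available in user's region
        kept.append(item_id)
    return kept


def rank(candidates, user, catalog, ranking_model):
    """Ranking: order candidates by the ranking model's score."""
    return sorted(
        candidates,
        key=lambda item_id: ranking_model(user, catalog[item_id]),
        reverse=True,
    )


def recommend(user, user_embedding, item_embeddings, catalog, ranking_model, k=10):
    candidates = retrieve(user_embedding, item_embeddings, k)
    candidates = filter_candidates(candidates, user, catalog)
    return rank(candidates, user, catalog, ranking_model)
```

Filtering sits between retrieval and ranking so that the more expensive ranking model only scores items the user could actually be shown.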
