MLOps at OLX
Alexey Grigorev
Principal Data Scientist at OLX Group
Founder at DataTalks.Club
Productionizing ML at OLX
Hello 👋
I’m Alexey.
Plan
● Data Science at OLX
● Way of Working
● MLOps maturity model
● Improvisation
Data Science at OLX
Main areas:
● Search
● Recommendations
● Trust & Safety
● Seller Experience
● Monetization
Search:
● Smart ranking
● Reducing null-searches
● Query categorization
● Spell checking
Recommendations:
● Collaborative filtering
● Item2Vec
Trust & Safety:
● NSFW detection
● Forbidden items
● Fraud detection
● Duplicate detection
● Chat moderation
Seller Experience:
● Image quality
● Listing quality
● Deal prediction
Monetization:
● User segmentation
● Bid optimization
https://tech.olx.com/data-science-at-olx-7c7406d1713f
Example: listings on OLX (https://olx.com)
Problems:
● Illegal items
● NSFW content
● Duplicates
● Spam
● Fraud
Content moderation
● Automatic moderation system (ML): Accept, Reject, or send to the moderation queue
○ Duplicate detection
○ Forbidden items
○ Other ML models
● Moderation panel (MP): moderators Accept or Reject listings from the moderation queue
Duplicate detection system (s3, ES, hashes)
● Index listings & images
● Detect duplicates
● Moderate duplicates
● Collect feedback
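The three outcomes of the automatic moderation system (Accept, Reject, Moderation queue) can be sketched as a threshold rule. This is only an illustration, not OLX's implementation: the name `route_listing`, the score scale, and both thresholds are assumptions.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    MODERATION_QUEUE = "moderation_queue"


def route_listing(violation_score: float,
                  accept_below: float = 0.1,
                  reject_above: float = 0.9) -> Decision:
    """Route a listing based on a model's violation score in [0, 1].

    The thresholds are hypothetical: confident cases are decided
    automatically, uncertain ones go to human moderators.
    """
    if not 0.0 <= violation_score <= 1.0:
        raise ValueError("violation_score must be in [0, 1]")
    if violation_score < accept_below:
        return Decision.ACCEPT
    if violation_score > reject_above:
        return Decision.REJECT
    return Decision.MODERATION_QUEUE


if __name__ == "__main__":
    for score in (0.02, 0.5, 0.97):
        print(score, route_listing(score).value)
```

Widening the gap between the two thresholds sends more listings to moderators; narrowing it automates more decisions at the cost of more mistakes.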
https://www.slideshare.net/AlexeyGrigorev/fighting-fraud-finding-duplicates-at-scale-highload-2019-191304763
https://tech.olx.com/a-two-step-framework-for-duplicate-detection-fbbe4c905480
https://tech.olx.com/detecting-image-duplicates-at-olx-scale-7f59e4b6aef4
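The "Hashes" box in the duplicate detection system suggests comparing compact image fingerprints. A minimal sketch of that idea follows, assuming an average hash over a tiny grayscale grid and a Hamming-distance cutoff. The actual OLX approach is described in the links above and may differ; all names and the `max_distance` value are hypothetical.

```python
from typing import Dict, List, Sequence

Image = Sequence[Sequence[int]]  # grayscale pixels, already resized to a small grid


def average_hash(image: Image) -> int:
    """Return a bit mask: 1 where a pixel is brighter than the image mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_duplicates(index: Dict[str, int], query_hash: int,
                    max_distance: int = 1) -> List[str]:
    """Return IDs of indexed listings whose hash is close to the query hash."""
    return [listing_id for listing_id, h in index.items()
            if hamming_distance(h, query_hash) <= max_distance]


if __name__ == "__main__":
    original = [[10, 200], [220, 30]]
    brighter_copy = [[20, 210], [230, 40]]  # same picture, slightly brighter
    different = [[200, 10], [30, 220]]

    index = {"listing-1": average_hash(original),
             "listing-2": average_hash(different)}
    print(find_duplicates(index, average_hash(brighter_copy)))
```

The linear scan over `index` is only for illustration; at scale the lookup would go through a search index instead.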
Way of Working
A project like this is very complex
We need a team (or multiple teams) to make it work: it’s a joint effort of many people working together
Roles in teams
● Product Manager (PM)
● Engineering Manager (EM)
● Software Engineers
○ Backend Engineers (BE)
○ Data Engineers (DE)
○ ML Engineers (MLE)
○ Site Reliability Engineers (SRE)
○ Frontend Engineers (FE)
○ Mobile Engineers
● Product Analysts (PA)
● Data Scientists (DS)
Matrix structure
● Teams: Team A, Team B, Team C
● Product: PMs, under the Head of Product
● Analytics: PAs, under the Head of Analytics
● Data science: DSs, under the Manager Data Tech
● Engineering: EMs, BE, DE, FE, SRE, under the Head of Engineering
Feature teams
● A cross-functional team with experts in different areas
● All work together on one feature/product
● All have the same goal!
● Anyone can work on anything, as long as it helps achieve the goal
Example team: PM, EM, PA, DS, DE, BE, SRE
Goal setting
● OKRs, set quarterly
● Great alignment tool: other teams know what you’re doing
● Whatever a team is doing should be in line with its OKRs
Example:
● Objective (O)
○ Catch more fraudsters
● Key results (KRs)
○ Precision of model A improves from 30% to 60% while staying at the same recall level
○ Model B is tested in 5 key markets
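A key result such as "precision improves while staying at the same recall level" implies comparing models at a fixed recall. A minimal sketch of that measurement follows; the function name and the toy data are assumptions, and it assumes distinct scores (ties are not handled).

```python
from typing import Sequence, Tuple


def precision_at_recall(labels: Sequence[int], scores: Sequence[float],
                        target_recall: float) -> Tuple[float, float]:
    """Return (precision, threshold) at the highest threshold reaching target_recall.

    labels: 1 for fraud, 0 for legitimate; scores: higher means more suspicious.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    # Walk down the ranking, flagging one more item at each step.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for flagged, (score, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / total_positives >= target_recall:
            return true_positives / flagged, score
    raise ValueError("target_recall must be at most 1.0")


if __name__ == "__main__":
    labels = [1, 1, 0, 1, 0, 0]
    old_scores = [0.9, 0.3, 0.8, 0.2, 0.7, 0.1]
    new_scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1]
    print(precision_at_recall(labels, old_scores, target_recall=0.6))
    print(precision_at_recall(labels, new_scores, target_recall=0.6))
```

Fixing recall first keeps the comparison fair: a model cannot look more precise simply by flagging fewer fraudsters.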
MLOps Maturity Levels
● Level 0: No MLOps
● Level 1: DevOps but no MLOps
● Level 2: Automated training
● Level 3: Automated model deployment
● Level 4: Full MLOps automation
https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-maturity-model
People
● Siloed lone data scientists
● Siloed teams
● Cross-functional teams
Model creation
● Training
○ Laptop
○ AWS Batch
○ AWS SageMaker
● Experiment tracking
○ Central MLflow server
● Version control
○ Code — always
○ Data — rarely
○ Models — rarely
Model release
● Manual release — for PoCs and less mature teams
● Automatic release via CI/CD (GitLab) — for the rest
● Metric-based automated retraining/release — rarely
● No handover to SWE — DS/team own the full cycle
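Metric-based automated release can be sketched as a gate that a CI/CD job evaluates before promoting a retrained model. This is an illustration under assumptions: the `ModelMetrics` type, the choice of precision and recall as gate metrics, and the default margins are hypothetical, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetrics:
    name: str
    precision: float
    recall: float


def should_release(candidate: ModelMetrics, production: ModelMetrics,
                   min_precision_gain: float = 0.01,
                   max_recall_drop: float = 0.0) -> bool:
    """Release only if precision improves enough and recall does not regress."""
    precision_gain = candidate.precision - production.precision
    recall_drop = production.recall - candidate.recall
    return precision_gain >= min_precision_gain and recall_drop <= max_recall_drop


if __name__ == "__main__":
    production = ModelMetrics("v1", precision=0.30, recall=0.50)
    candidate = ModelMetrics("v2", precision=0.60, recall=0.50)
    print(should_release(candidate, production))
```

In a pipeline, a `False` result would typically stop the job and leave the production model in place.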
Application integration
● Unit and integration tests — always
● Rely on software engineers to integrate with the OLX backend
● A/B tests — often
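One common way to read an A/B test on a conversion-style metric is a two-proportion z-test. The sketch below is a generic illustration; the talk does not say which statistical test OLX uses, and the function name and sample numbers are assumptions.

```python
import math
from typing import Tuple


def two_proportion_z_test(conversions_a: int, users_a: int,
                          conversions_b: int, users_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z={z:.2f}, p={p:.4f}")
```

A positive `z` means variant B converts better than A; whether the difference matters is still a product decision.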
Improvisation
I’m happy to talk more about
● Processes
● Our data platform
● Experimentation
● Model deployment
● And other things!
@Al_Grigor
agrigorev
DataTalks.Club

