SlideShare a Scribd company logo
1 of 37
Download to read offline
Agile Machine Learning for Real-time 
Recommender Systems 
Johann Schleier-Smith 
CTO, if(we) 
@jssmith johann@ifwe.co github.com/ifwe
what it should look like
1. Gain understanding of machine learning 
2. Gain understanding of the product usage 
3. See opportunity to make the product better 
4. Create training data 
5. Train predictive models 
6. Put models in production 
7. See improvements
what it often looks like
1. Gain understanding of machine learning 
2. Gain understanding of the product usage 
3. See opportunity to make the product better 
4. Pull records from database to create interesting 
features (usually aggregates) 
5. Train predictive models 
6. Go implement models for production 
7. See improvements
1. Gain understanding of machine learning 
2. Gain understanding of the product usage 
3. See opportunity to make the product better 
4. Pull records from database to create interesting 
features (usually aggregates) 
5. Train predictive models 
6. Go implement models for production 
7. See improvements 
3-6 
months
1. Gain understanding of machine learning 
2. Gain understanding of the product usage 
3. See opportunity to make the product better 
4. Pull records from database to create interesting 
features (usually aggregates) 
5. Train predictive models 
6. Go implement models for production 
7. See improvements Cool! 
Wa s i t w o r t h i t ?
• Profitable startup actively pursuing big 
opportunities in social apps 
• Millions of users of existing brands 
• Thousands of social contacts per second
real-time 
recommendations 
challenges
Tagged dating feature 
• >10 million candidates 
to select from 
• >1000 updates/sec 
• Must be responsive to 
current activity 
• Users expect instant 
query results
implementation 
pain points
• Data scientist hands model description to 
software engineer 
• May need to translate features from SQL to Java 
• Aggregate features require batch processing 
• May need to adjust features and model to 
achieve real-time updates 
• Fast scoring requires high-performance in-memory 
data structures
time for 
new thinking
one way that 
works better
! 
! 
! 
4. Pull records from database to create interesting 
features (usually aggregates) 
5. Train predictive models 
6. Go implement models for production
Create interesting features 
Train predictive models 
Put models in production
Create interesting features 
Train predictive models 
Put models in production
one right way to data 
event history
History. 
filterTime(start, PLUS_INFINITY). 
foreach { 
e: Event => model.update(e) 
}
everything is an event
Bob registers 
Alice registers 
Alice updates profile 
Bob opens app 
Bob sees Alice in recommendations 
Bob swipes yes on Alice 
Alice receives push notification 
Alice sees Bob swiped yes 
Alice swipes yes 
Alice sends message to Bob
writing the model
class MyModel { 
def update(e: Event) { … } 
def topN(ctx: Context, n: Int) = { … } 
}
models are all 
about features
class MyFeature { 
def update(e: Event) { … } 
def score(ctx: Context, 
candidateId: Long): Double = { … } 
}
model training
History. 
filterTime(start, PLUS_INFINITY). 
foreach { 
e: Event => { 
writeTrainingData(outcome(e), 
model.features(context(e)) 
model.update(e) 
} 
}
live demo 
Kaggle competition 
with Best Buy data 
https://www.kaggle.com/c/acm-sf-chapter-hackathon-small
product update events 
{ 
“timestamp” : “2012-05-03 6:43:15”, 
“eventType” : “ProductUpdate”, 
“eventProperties” : { 
“sku” : “1032361”, 
“regularPrice” : “19.99”, 
“name” : “Need for Speed: Hot Pursuit”, 
“description” : “Fasten your seatbelt and 
get ready to drive like your life depends 
on it...” 
... 
} 
}
product view events 
{ 
“timestamp” : “2011-10-31 09:48:46”, 
“eventType” : “ProductView”, 
“eventProperties” : { 
“skuSelected” : “2670133”, 
“query” : “Modern warfare” 
} 
}
demo 
Try it yourself, code and instructions at: 
https://github.com/ifweco/antelope/blob/master/doc/demo.md
1. Gain understanding of machine learning 
2. Gain understanding of the product usage 
3. See opportunity to make the product better 
4. Create training data 
5. Train predictive models 
6. Put models in production 
7. See improvements 
Fast cycles!!
• All data in form of events – no exceptions! 
• Roll through history to generate training examples 
• Sample training data carefully to avoid feedback 
• Model is static while features are live and personal 
• Use interesting features with boring algorithms 
• Expressiveness > performance > scalability 
github.com/ifwe/antelope 
@jssmith

More Related Content

What's hot

ML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationHunter Carlisle
 
MLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumMLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumSasha Rosenbaum
 
MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)Julien SIMON
 
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...Bill Liu
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerProvectus
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_futureNisha Talagala
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
Seamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowSeamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowDatabricks
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusManasi Vartak
 
Nasscom ml ops webinar
Nasscom ml ops webinarNasscom ml ops webinar
Nasscom ml ops webinarSameer Mahajan
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsFatih Baltacı
 
Productionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceProductionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceDatabricks
 
Modern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and PracticesModern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and PracticesWill Gardella
 
Production machine learning_infrastructure
Production machine learning_infrastructureProduction machine learning_infrastructure
Production machine learning_infrastructurejoshwills
 
Richard Coffey (x18140785) - Research in Computing CA2
Richard Coffey (x18140785) - Research in Computing CA2Richard Coffey (x18140785) - Research in Computing CA2
Richard Coffey (x18140785) - Research in Computing CA2Richard Coffey
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.Knoldus Inc.
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in productionStepan Pushkarev
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOpsRui Quintino
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to productionGeorg Heiler
 
Driverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.aiDriverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.aiSri Ambati
 

What's hot (20)

ML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production Application
 
MLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumMLOps by Sasha Rosenbaum
MLOps by Sasha Rosenbaum
 
MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)
 
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_future
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
Seamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowSeamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflow
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
 
Nasscom ml ops webinar
Nasscom ml ops webinarNasscom ml ops webinar
Nasscom ml ops webinar
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOps
 
Productionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceProductionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness Marketplace
 
Modern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and PracticesModern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and Practices
 
Production machine learning_infrastructure
Production machine learning_infrastructureProduction machine learning_infrastructure
Production machine learning_infrastructure
 
Richard Coffey (x18140785) - Research in Computing CA2
Richard Coffey (x18140785) - Research in Computing CA2Richard Coffey (x18140785) - Research in Computing CA2
Richard Coffey (x18140785) - Research in Computing CA2
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in production
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to production
 
Driverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.aiDriverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.ai
 

Viewers also liked

Fuzzy clustering using RSIO-FCM ppt
Fuzzy clustering using RSIO-FCM pptFuzzy clustering using RSIO-FCM ppt
Fuzzy clustering using RSIO-FCM pptNIGAN NAYAK
 
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)Parinda Rajapaksha
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFMLconf
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFMLconf
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFMLconf
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFMLconf
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFMLconf
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFMLconf
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...Sri Ambati
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFMLconf
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperSaurav Haloi
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 

Viewers also liked (16)

Fuzzy clustering using RSIO-FCM ppt
Fuzzy clustering using RSIO-FCM pptFuzzy clustering using RSIO-FCM ppt
Fuzzy clustering using RSIO-FCM ppt
 
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SF
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SF
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 

Similar to Agile Machine Learning for Real-time Recommender Systems

BackboneJS Training - Giving Backbone to your applications
BackboneJS Training - Giving Backbone to your applicationsBackboneJS Training - Giving Backbone to your applications
BackboneJS Training - Giving Backbone to your applicationsJoseph Khan
 
[2C2]PredictionIO
[2C2]PredictionIO[2C2]PredictionIO
[2C2]PredictionIONAVER D2
 
WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...Fabio Franzini
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)Jasjeet Thind
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableJustin Basilico
 
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Provectus
 
Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#J On The Beach
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionSplunk
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)AZUG FR
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionSplunk
 
Splunk for Machine Learning and Analytics
Splunk for Machine Learning and AnalyticsSplunk for Machine Learning and Analytics
Splunk for Machine Learning and AnalyticsShannon Cuthbertson
 
Splunk for Machine Learning and Analytics
Splunk for Machine Learning and AnalyticsSplunk for Machine Learning and Analytics
Splunk for Machine Learning and AnalyticsSplunk
 
PredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF ScalaPredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF Scalapredictionio
 
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesPyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesJim Dowling
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Databricks
 
Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21Gülden Bilgütay
 
Agile Experiments in Machine Learning
Agile Experiments in Machine LearningAgile Experiments in Machine Learning
Agile Experiments in Machine Learningmathias-brandewinder
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Neotys_Partner
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionSplunk
 

Similar to Agile Machine Learning for Real-time Recommender Systems (20)

BackboneJS Training - Giving Backbone to your applications
BackboneJS Training - Giving Backbone to your applicationsBackboneJS Training - Giving Backbone to your applications
BackboneJS Training - Giving Backbone to your applications
 
[2C2]PredictionIO
[2C2]PredictionIO[2C2]PredictionIO
[2C2]PredictionIO
 
WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...WebNet Conference 2012 - Designing complex applications using html5 and knock...
WebNet Conference 2012 - Designing complex applications using html5 and knock...
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
 
Monitoring AI with AI
Monitoring AI with AIMonitoring AI with AI
Monitoring AI with AI
 
Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Splunk for Machine Learning and Analytics
Splunk for Machine Learning and AnalyticsSplunk for Machine Learning and Analytics
Splunk for Machine Learning and Analytics
 
Splunk for Machine Learning and Analytics
Splunk for Machine Learning and AnalyticsSplunk for Machine Learning and Analytics
Splunk for Machine Learning and Analytics
 
PredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF ScalaPredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF Scala
 
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesPyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
 
Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21Machine Learning for .NET Developers - ADC21
Machine Learning for .NET Developers - ADC21
 
Agile Experiments in Machine Learning
Agile Experiments in Machine LearningAgile Experiments in Machine Learning
Agile Experiments in Machine Learning
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 

Recently uploaded

Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 

Recently uploaded (20)

Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 

Agile Machine Learning for Real-time Recommender Systems

  • 1. Agile Machine Learning for Real-time Recommender Systems Johann Schleier-Smith CTO, if(we) @jssmith johann@ifwe.co github.com/ifwe
  • 2. what it should look like
  • 3. 1. Gain understanding of machine learning 2. Gain understanding of the product usage 3. See opportunity to make the product better 4. Create training data 5. Train predictive models 6. Put models in production 7. See improvements
  • 4. what it often looks like
  • 5. 1. Gain understanding of machine learning 2. Gain understanding of the product usage 3. See opportunity to make the product better 4. Pull records from database to create interesting features (usually aggregates) 5. Train predictive models 6. Go implement models for production 7. See improvements
  • 6. 1. Gain understanding of machine learning 2. Gain understanding of the product usage 3. See opportunity to make the product better 4. Pull records from database to create interesting features (usually aggregates) 5. Train predictive models 6. Go implement models for production 7. See improvements 3-6 months
  • 7. 1. Gain understanding of machine learning 2. Gain understanding of the product usage 3. See opportunity to make the product better 4. Pull records from database to create interesting features (usually aggregates) 5. Train predictive models 6. Go implement models for production 7. See improvements Cool! Wa s i t w o r t h i t ?
  • 8. • Profitable startup actively pursuing big opportunities in social apps • Millions of users of existing brands • Thousands of social contacts per second
  • 10. Tagged dating feature • >10 million candidates to select from • >1000 updates/sec • Must be responsive to current activity • Users expect instant query results
  • 12. • Data scientist hands model description to software engineer • May need to translate features from SQL to Java • Aggregate features require batch processing • May need to adjust features and model to achieve real-time updates • Fast scoring requires high-performance in-memory data structures
  • 13. time for new thinking
  • 14. one way that works better
  • 15. ! ! ! 4. Pull records from database to create interesting features (usually aggregates) 5. Train predictive models 6. Go implement models for production
  • 16. Create interesting features Train predictive models Put models in production
  • 17. Create interesting features Train predictive models Put models in production
  • 18. one right way to data event history
  • 19. History. filterTime(start, PLUS_INFINITY). foreach { e: Event => model.update(e) }
  • 21. Bob registers Alice registers Alice updates profile Bob opens app Bob sees Alice in recommendations Bob swipes yes on Alice Alice receives push notification Alice sees Bob swiped yes Alice swipes yes Alice sends message to Bob
  • 23. class MyModel { def update(e: Event) { … } def topN(ctx: Context, n: Int) = { … } }
  • 24. models are all about features
  • 25. class MyFeature { def update(e: Event) { … } def score(ctx: Context, candidateId: Long): Double = { … } }
  • 27. History. filterTime(start, PLUS_INFINITY). foreach { e: Event => { writeTrainingData(outcome(e), model.features(context(e)) model.update(e) } }
  • 28. live demo Kaggle competition with Best Buy data https://www.kaggle.com/c/acm-sf-chapter-hackathon-small
  • 29. product update events { “timestamp” : “2012-05-03 6:43:15”, “eventType” : “ProductUpdate”, “eventProperties” : { “sku” : “1032361”, “regularPrice” : “19.99”, “name” : “Need for Speed: Hot Pursuit”, “description” : “Fasten your seatbelt and get ready to drive like your life depends on it...” ... } }
  • 30. product view events { “timestamp” : “2011-10-31 09:48:46”, “eventType” : “ProductView”, “eventProperties” : { “skuSelected” : “2670133”, “query” : “Modern warfare” } }
  • 31. demo Try it yourself, code and instructions at: https://github.com/ifweco/antelope/blob/master/doc/demo.md
  • 32.
  • 33.
  • 34.
  • 35. 1. Gain understanding of machine learning 2. Gain understanding of the product usage 3. See opportunity to make the product better 4. Create training data 5. Train predictive models 6. Put models in production 7. See improvements Fast cycles!!
  • 36.
  • 37. • All data in form of events – no exceptions! • Roll through history to generate training examples • Sample training data carefully to avoid feedback • Model is static while features are live and personal • Use interesting features with boring algorithms • Expressiveness > performance > scalability github.com/ifwe/antelope @jssmith