Lessons Learned
from building real-life Machine Learning Systems
Xavier Amatriain (@xamat)
www.quora.com/profile/Xavier-Amatriain
3/29/16
A bit about Quora
Our Mission
“To share and grow
the world’s knowledge”
• Millions of questions & answers
• Millions of users
• Thousands of topics
• ...
What we care about
• Demand
• Quality
• Relevance
Lessons Learned
More Data vs. Better Models
More data or better models?
Really?
Anand Rajaraman: VC, Founder, Stanford Professor
More data or better models?
Sometimes, it’s
not about more
data
More data or better models?
Norvig:
“Google does not have
better Algorithms only
more Data”
Many features/
low-bias models
More data or better models?
Sometimes, it’s
not about more
data
Sometimes you do need
A (more) Complex Model
Better models and features that “don’t work”
● E.g. You have a linear model and have
been selecting and optimizing features
for that model
■ More complex model with the same features
-> improvement not likely
■ More expressive features with the same model
-> improvement not likely
● More complex features may require a
more complex model
● A more complex model may not show
improvements with a feature set that is
too simple
Model selection is also about
Hyperparameter optimization
Hyperparameter optimization
● Automate hyperparameter
optimization by choosing the
right metric.
○ But, is it as simple as choosing the
max?
● Bayesian Optimization
(Gaussian Processes) better
than grid search
○ See spearmint, hyperopt, AutoML,
MOE...
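The question "is it as simple as choosing the max?" can be illustrated with a small sketch. It assumes cross-validation scores per candidate setting (the numbers are made up) and compares picking the highest mean with the one-standard-error rule. That rule is a common heuristic swapped in here for illustration; the slides do not name it. All function and variable names are hypothetical.

```python
import math
from statistics import mean, stdev

# Hypothetical cross-validation scores per regularization strength
# (larger strength = simpler model). The numbers are made up.
cv_scores = {
    0.01: [0.81, 0.70, 0.85, 0.72, 0.84],
    0.10: [0.78, 0.77, 0.79, 0.78, 0.77],
    1.00: [0.70, 0.71, 0.69, 0.70, 0.70],
}

def pick_max(scores):
    """Return the setting with the highest mean score."""
    return max(scores, key=lambda k: mean(scores[k]))

def pick_one_standard_error(scores):
    """Return the simplest setting within one standard error of the best mean."""
    best = pick_max(scores)
    std_err = stdev(scores[best]) / math.sqrt(len(scores[best]))
    threshold = mean(scores[best]) - std_err
    eligible = [k for k in scores if mean(scores[k]) >= threshold]
    return max(eligible)  # largest strength = simplest model

print(pick_max(cv_scores), pick_one_standard_error(cv_scores))
```

In this toy data the setting with the highest mean is also the noisiest, so the two rules disagree. Which one is appropriate depends on how much the metric varies between runs.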
Supervised vs. plus
Unsupervised Learning
Supervised/Unsupervised Learning
● Unsupervised learning as dimensionality reduction
● Unsupervised learning as feature engineering
● The “magic” behind combining
unsupervised/supervised learning
○ E.g.1 clustering + knn
○ E.g.2 Matrix Factorization
■ MF can be interpreted as
● Unsupervised:
○ Dimensionality Reduction a la PCA
○ Clustering (e.g. NMF)
● Supervised
○ Labeled targets ~ regression
Supervised/Unsupervised Learning
● One of the “tricks” in Deep Learning is how it
combines unsupervised/supervised learning
○ E.g. Stacked Autoencoders
○ E.g. training of convolutional nets
Everything is an ensemble
Ensembles
● Netflix Prize was won by an ensemble
○ Initially BellKor was using GBDTs
○ BigChaos introduced ANN-based ensemble
● Most practical applications of ML run an ensemble
○ Why wouldn’t you?
○ At least as good as the best of your methods
○ Can add completely different approaches
(e.g. CF and content-based)
○ You can use many different models at the
ensemble layer: LR, GBDTs, RFs, ANNs...
Ensembles & Feature Engineering
● Ensembles are the way to turn any model into a feature!
● E.g. Don’t know if the way to go is to use Factorization
Machines, Tensor Factorization, or RNNs?
○ Treat each model as a “feature”
○ Feed them into an ensemble
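A minimal sketch of "treat each model as a feature", assuming two stand-in base models and a logistic regression ensemble layer trained by plain gradient descent. The base models, the toy data, and every name below are hypothetical; a real system would plug in its own models and a library implementation of the ensemble layer.

```python
import math

def model_a(x):
    """Hypothetical base model: scores using only the first input."""
    return 1.0 if x[0] > 0.5 else 0.0

def model_b(x):
    """Hypothetical base model: scores using only the second input."""
    return x[1]

def to_features(x):
    # Each base model's output becomes one feature for the ensemble layer.
    return [model_a(x), model_b(x)]

def train_logistic(rows, labels, steps=5000, lr=0.5):
    """Fit logistic regression weights (bias first) by batch gradient descent."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for feats, y in zip(rows, labels):
            z = w[0] + sum(wi * f for wi, f in zip(w[1:], feats))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad[0] += err
            for i, f in enumerate(feats):
                grad[i + 1] += err * f
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def predict(w, feats):
    """Probability of the positive class under the fitted ensemble layer."""
    z = w[0] + sum(wi * f for wi, f in zip(w[1:], feats))
    return 1.0 / (1.0 + math.exp(-z))

# Toy data (made up): the label is 1 only when both inputs are high.
data = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.9, 0.1], 0),
        ([0.1, 0.9], 0), ([0.2, 0.2], 0), ([0.7, 0.7], 1)]
rows = [to_features(x) for x, _ in data]
labels = [y for _, y in data]
weights = train_logistic(rows, labels)
preds = [round(predict(weights, r)) for r in rows]
print(preds)
```

Neither base model separates the toy data on its own, but the ensemble layer can, because it sees both outputs at once.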
The Master Algorithm?
It definitely is the ensemble!
The pains & gains
of Feature Engineering
Feature Engineering
● Main properties of a well-behaved ML feature
○ Reusable
○ Transformable
○ Interpretable
○ Reliable
● Reusability: You should be able to reuse features in different
models, applications, and teams
● Transformability: Besides directly reusing a feature, it
should be easy to use a transformation of it (e.g. log(f), max(f),
$\sum_t f_t$ over a time window…)
● Interpretability: In order to do any of the previous, you
need to be able to understand the meaning of features and
interpret their values.
● Reliability: It should be easy to monitor and detect bugs/issues
in features
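The transformations mentioned under Transformability (log, max, and a sum over a time window) can be sketched as below. The feature (daily upvote counts) and all names are hypothetical, and `log1p` is used in place of a bare log so that zero counts are handled; that substitution is an assumption, not something the slides specify.

```python
import math
from collections import deque

def log_transform(value):
    """log(1 + f): compresses heavy-tailed counts; assumes value >= 0."""
    return math.log1p(value)

def rolling(values, window, reducer):
    """Apply reducer (e.g. sum or max) over a sliding window of recent values."""
    recent = deque(maxlen=window)
    out = []
    for v in values:
        recent.append(v)
        out.append(reducer(recent))
    return out

# Hypothetical raw feature: daily upvote counts for one answer (made-up data).
daily_upvotes = [0, 3, 1, 0, 7, 2]

windowed_sum = rolling(daily_upvotes, window=3, reducer=sum)
windowed_max = rolling(daily_upvotes, window=3, reducer=max)
logged = [log_transform(v) for v in daily_upvotes]
print(windowed_sum)
print(windowed_max)
```

Keeping the raw feature and the transformation separate is what makes the same feature reusable across models that want different views of it.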
Feature Engineering Example - Quora Answer Ranking
What is a good Quora answer?
• truthful
• reusable
• provides explanation
• well formatted
• ...
Feature Engineering Example - Quora Answer Ranking
How are those dimensions translated
into features?
• Features that relate to the answer
quality itself
• Interaction features
(upvotes/downvotes, clicks,
comments…)
• User features (e.g. expertise in topic)
Implicit signals beat
explicit ones
(almost always)
Implicit vs. Explicit
● Many have acknowledged
that implicit feedback is more useful
● Is implicit feedback really always
more useful?
● If so, why?
● Implicit data is (usually):
○ More dense, and available for all users
○ More representative of user behavior vs.
user reflection
○ More related to final objective function
○ Better correlated with AB test results
● E.g. Rating vs watching
Implicit vs. Explicit
● However
○ It is not always the case that
direct implicit feedback correlates
well with long-term retention
○ E.g. clickbait
● Solution:
○ Combine different forms of
implicit + explicit to better represent
long-term goal
Implicit vs. Explicit
Be thoughtful about your
Training Data
Defining training/testing data
● Training a simple binary classifier for good/bad
answer
○ Defining positive and negative labels ->
Non-trivial task
○ Is this a positive or a negative?
● funny uninformative answer with many upvotes
● short uninformative answer by a well-known
expert in the field
● very long informative answer that nobody
reads/upvotes
● informative answer with grammar/spelling
mistakes
● ...
Other training data issues: Time traveling
● Time traveling: usage of features that originated after the
event you are trying to predict
○ E.g. Your upvoting an answer is a pretty good predictor
of you reading that answer, precisely because most
upvotes happen AFTER you read the answer
○ Tricky when you have many related features
○ Whenever I see an offline experiment with huge wins, I
ask: “Is there time traveling?”
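One way to guard against time traveling is to filter the event log by timestamp before building features. The sketch below assumes each event carries a timestamp; the event names and field names are hypothetical.

```python
from datetime import datetime

def features_available_at(events, prediction_time):
    """Keep only events logged strictly before the moment being predicted."""
    return [e for e in events if e["timestamp"] < prediction_time]

# Hypothetical event log for one (user, answer) pair; field names are made up.
events = [
    {"name": "impression", "timestamp": datetime(2016, 3, 1, 10, 0)},
    {"name": "upvote", "timestamp": datetime(2016, 3, 1, 10, 5)},
]

# We want to predict a read that happened at 10:02.
read_time = datetime(2016, 3, 1, 10, 2)
usable = features_available_at(events, read_time)
print([e["name"] for e in usable])
```

The upvote is dropped because it was logged after the read it would be used to predict.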
Your Model will learn
what you teach it to learn
Training a model
● Model will learn according to:
○ Training data (e.g. implicit and explicit)
○ Target function (e.g. probability of user reading an answer)
○ Metric (e.g. precision vs. recall)
● Example 1 (made up):
○ Optimize probability of a user going to the cinema to
watch a movie and rate it “highly” by using purchase history
and previous ratings. Use NDCG of the ranking as final
metric using only movies rated 4 or higher as positives.
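The metric in Example 1 can be sketched as NDCG with binary relevance, where a rating of 4 or higher counts as a positive. The sketch assumes the common log2(rank + 1) discount; the function names and the ratings are made up.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with the log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_ratings, threshold=4):
    """NDCG with binary relevance: ratings >= threshold count as positives."""
    rels = [1 if r >= threshold else 0 for r in ranked_ratings]
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

# Ratings of movies in the order the model ranked them (made-up data).
print(ndcg([5, 2, 4, 3]))
print(ndcg([5, 4, 3, 2]))
```

Changing the threshold changes which movies are positives, and therefore what the model is rewarded for ranking first.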
Example 2 - Quora’s feed
● Training data = implicit + explicit
● Target function: Value of showing a story to a
user ~ weighted sum of actions: $v = \sum_a v_a \, \mathbb{1}\{y_a = 1\}$
○ predict probabilities for each action, then compute expected
value: $v_{pred} = E[V \mid x] = \sum_a v_a \, p(a \mid x)$
● Metric: any ranking metric
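The expected-value target in Example 2 can be sketched as a weighted sum of per-action probabilities. The action names, their values, and the probabilities below are all made up for illustration; the slides do not give the actual weights.

```python
# Hypothetical action values v_a; the real weights are not given in the slides.
ACTION_VALUES = {"upvote": 1.0, "click": 0.25, "share": 2.0, "downvote": -1.5}

def expected_value(action_probs, action_values=ACTION_VALUES):
    """v_pred = sum over actions a of v_a * p(a | x)."""
    return sum(action_values[a] * p for a, p in action_probs.items())

def rank_stories(candidates):
    """Sort story ids by predicted value, highest first."""
    return sorted(candidates, key=lambda s: expected_value(candidates[s]), reverse=True)

# Made-up per-action probabilities, standing in for the outputs of per-action models.
candidates = {
    "story_1": {"upvote": 0.10, "click": 0.40, "share": 0.01, "downvote": 0.02},
    "story_2": {"upvote": 0.02, "click": 0.80, "share": 0.00, "downvote": 0.10},
}
print(rank_stories(candidates))
```

Here the story with the higher click probability ranks lower, because its expected downvotes outweigh the extra clicks under these assumed weights.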
Offline testing
● Measure model performance,
using (IR) metrics
● Offline performance = indication
to make decisions on follow-up
A/B tests
● A critical (and mostly unsolved)
issue is how offline metrics
correlate with A/B test results.
Learn to deal with
Presentation Bias
2D Navigational modeling
[Figure: page regions labeled from "more likely to see" to "less likely"]
The curse of presentation bias
● User can only click on what you decide to show
● But, what you decide to show is the result of what your model
predicted is good
● Simply treating things you show as negatives is not likely to work
● Better options
● Correcting for the probability a user will click on a position ->
Attention models
● Explore/exploit approaches such as MAB
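Correcting for position can be sketched with a simple examination model: each impression is discounted by the probability that a user looks at that slot. This is one possible reading of "attention models", not necessarily the approach used in the talk. The examination probabilities, the log, and all names are assumptions for illustration.

```python
def debiased_ctr(impressions, examine_prob):
    """Estimate per-item click rate, discounting impressions by the chance they were seen.

    impressions: list of (item, position, clicked) tuples.
    examine_prob: position -> assumed probability that a user looks at that slot.
    """
    clicks, exposure = {}, {}
    for item, position, clicked in impressions:
        clicks[item] = clicks.get(item, 0) + int(clicked)
        # An impression in a rarely seen slot counts as only a fraction of a view.
        exposure[item] = exposure.get(item, 0.0) + examine_prob[position]
    return {item: clicks[item] / exposure[item] for item in clicks}

# Made-up examination probabilities and logs; in practice these would be estimated.
examine_prob = {1: 1.0, 2: 0.5, 3: 0.25}
log = [("a", 1, True), ("a", 1, False), ("a", 1, False), ("a", 1, False),
       ("b", 3, True), ("b", 3, False), ("b", 3, False), ("b", 3, False)]
print(debiased_ctr(log, examine_prob))
```

Both items have the same raw click rate (1 in 4), but item "b" earned its click from a slot that is rarely seen, so its corrected rate is higher.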
You don’t need to distribute
your ML algorithm
Distributing ML
● Most of what people do in practice can fit into a
multi-core machine
○ Smart data sampling
○ Offline schemes
○ Efficient parallel code
● Dangers of “easy” distributed approaches such
as Hadoop/Spark
● Do you care about costs? How about latencies?
Distributing ML
● Example of optimizing computations to fit them into
one machine
○ Spark implementation: 6 hours, 15 machines
○ Developer time: 4 days
○ C++ implementation: 10 minutes, 1 machine
● Most practical applications of Big Data can fit into
a (multicore) implementation
The untold story of
Data Science and vs. ML engineering
Data Scientists and ML Engineers
● We all know the definition of a Data Scientist
● Where do Data Scientists fit in an organization?
○ Many companies struggling with this
● Valuable to have strong DS who can bring value
from the data
● Strong DS with solid engineering skills are
unicorns and finding them is not scalable
○ DS need engineers to bring things to production
○ Engineers have too much on their plate to be willing to
“productionize” cool DS projects
The data-driven ML innovation funnel
Data Research
ML Exploration - Product Design
AB Testing
Data Scientists and ML Engineers
● Solution:
○ (1) Define different parts of the innovation funnel
■ Part 1. Data research & hypothesis
building -> Data Science
■ Part 2. ML solution building &
implementation -> ML Engineering
■ Part 3. Online experimentation, AB
Testing analysis-> Data Science
○ (2) Broaden the definition of ML Engineers
to range from coding experts with high-level
ML knowledge to ML experts with good
software skills
[Figure: funnel stages and owners: Data Research (Data Science), ML Solution (ML Engineering), AB Testing (Data Science)]
Conclusions
● In data, size is not all that matters
● Understand dependencies between data, models
& systems
● Choose the right metric & optimize what matters
● Be thoughtful about
○ your ML infrastructure/tools
○ organizing your teams
Questions?
Strata 2016 -  Lessons Learned from building real-life Machine Learning Systems

More Related Content

What's hot

RecSys 2016 Talk: Feature Selection For Human Recommenders
RecSys 2016 Talk: Feature Selection For Human RecommendersRecSys 2016 Talk: Feature Selection For Human Recommenders
RecSys 2016 Talk: Feature Selection For Human Recommenders
Katherine Livins
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
BigMine
 
Introduction to machine learning and applications (1)
Introduction to machine learning and applications (1)Introduction to machine learning and applications (1)
Introduction to machine learning and applications (1)
Manjunath Sindagi
 
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Xavier Amatriain
 
Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
Paco Nathan
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
Justin Basilico
 
Tips for data science competitions
Tips for data science competitionsTips for data science competitions
Tips for data science competitions
Owen Zhang
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
Pranav Ainavolu
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
ananth
 
10 books that every developer must read
10 books that every developer must read10 books that every developer must read
10 books that every developer must read
Ganesh Samarthyam
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
Jaya Kawale
 
Machine learning basics
Machine learning basics Machine learning basics
Machine learning basics
Akanksha Bali
 
Winning data science competitions, presented by Owen Zhang
Winning data science competitions, presented by Owen ZhangWinning data science competitions, presented by Owen Zhang
Winning data science competitions, presented by Owen Zhang
Vivian S. Zhang
 
Agile Deep Learning
Agile Deep LearningAgile Deep Learning
Agile Deep Learning
David Murgatroyd
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
Yves Raimond
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
Justin Basilico
 
Model Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisModel Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model Analysis
Vivek Raja P S
 
Machine Learning Basics
Machine Learning BasicsMachine Learning Basics
Machine Learning Basics
Suresh Arora
 
Machine learning the next revolution or just another hype
Machine learning   the next revolution or just another hypeMachine learning   the next revolution or just another hype
Machine learning the next revolution or just another hype
Jorge Ferrer
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models
ananth
 

What's hot (20)

RecSys 2016 Talk: Feature Selection For Human Recommenders
RecSys 2016 Talk: Feature Selection For Human RecommendersRecSys 2016 Talk: Feature Selection For Human Recommenders
RecSys 2016 Talk: Feature Selection For Human Recommenders
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 
Introduction to machine learning and applications (1)
Introduction to machine learning and applications (1)Introduction to machine learning and applications (1)
Introduction to machine learning and applications (1)
 
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
 
Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
 
Tips for data science competitions
Tips for data science competitionsTips for data science competitions
Tips for data science competitions
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
 
10 books that every developer must read
10 books that every developer must read10 books that every developer must read
10 books that every developer must read
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
 
Machine learning basics
Machine learning basics Machine learning basics
Machine learning basics
 
Winning data science competitions, presented by Owen Zhang
Winning data science competitions, presented by Owen ZhangWinning data science competitions, presented by Owen Zhang
Winning data science competitions, presented by Owen Zhang
 
Agile Deep Learning
Agile Deep LearningAgile Deep Learning
Agile Deep Learning
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
Model Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisModel Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model Analysis
 
Machine Learning Basics
Machine Learning BasicsMachine Learning Basics
Machine Learning Basics
 
Machine learning the next revolution or just another hype
Machine learning   the next revolution or just another hypeMachine learning   the next revolution or just another hype
Machine learning the next revolution or just another hype
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models
 

Similar to Strata 2016 - Lessons Learned from building real-life Machine Learning Systems

BIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systemsBIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
Xavier Amatriain
 
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15
MLconf
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
Xavier Amatriain
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
Xavier Amatriain
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
Daniel Marcous
 
Prototyping Workshop - Wireframes, Mockups, Prototypes
Prototyping Workshop - Wireframes, Mockups, PrototypesPrototyping Workshop - Wireframes, Mockups, Prototypes
Prototyping Workshop - Wireframes, Mockups, Prototypes
Marta Soncodi
 
Staying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning WorldStaying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning World
Xavier Amatriain
 
Recommendation Modeling with Impression Data at Netflix
Recommendation Modeling with Impression Data at NetflixRecommendation Modeling with Impression Data at Netflix
Recommendation Modeling with Impression Data at Netflix
Jiangwei Pan
 
Effective Tips for Building ML Products by Rally Health Lead PM
Effective Tips for Building ML Products by Rally Health Lead PMEffective Tips for Building ML Products by Rally Health Lead PM
Effective Tips for Building ML Products by Rally Health Lead PM
Product School
 
What Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PMWhat Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PM
Product School
 
Scaling Recommendations at Quora (RecSys talk 9/16/2016)
Scaling Recommendations at Quora (RecSys talk 9/16/2016)Scaling Recommendations at Quora (RecSys talk 9/16/2016)
Scaling Recommendations at Quora (RecSys talk 9/16/2016)
Nikhil Dandekar
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
Awantik Das
 
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f..."Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
Edge AI and Vision Alliance
 
Machine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup EventMachine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup Event
Benjamin Schulte
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
Adam Doyle
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Xavier Amatriain
 
General introduction to AI ML DL DS
General introduction to AI ML DL DSGeneral introduction to AI ML DL DS
General introduction to AI ML DL DS
Roopesh Kohad
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
Xavier Amatriain
 
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
MLconf
 
L15.pptx
L15.pptxL15.pptx
L15.pptx
ImonBennett
 

Similar to Strata 2016 - Lessons Learned from building real-life Machine Learning Systems (20)

BIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systemsBIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
 
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15
Xavier Amatriain, VP of Engineering, Quora at MLconf SF - 11/13/15
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
Prototyping Workshop - Wireframes, Mockups, Prototypes
Prototyping Workshop - Wireframes, Mockups, PrototypesPrototyping Workshop - Wireframes, Mockups, Prototypes
Prototyping Workshop - Wireframes, Mockups, Prototypes
 
Staying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning WorldStaying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning World
 
Recommendation Modeling with Impression Data at Netflix
Recommendation Modeling with Impression Data at NetflixRecommendation Modeling with Impression Data at Netflix
Recommendation Modeling with Impression Data at Netflix
 
Effective Tips for Building ML Products by Rally Health Lead PM
Effective Tips for Building ML Products by Rally Health Lead PMEffective Tips for Building ML Products by Rally Health Lead PM
Effective Tips for Building ML Products by Rally Health Lead PM
 
What Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PMWhat Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PM
 
Scaling Recommendations at Quora (RecSys talk 9/16/2016)
Scaling Recommendations at Quora (RecSys talk 9/16/2016)Scaling Recommendations at Quora (RecSys talk 9/16/2016)
Scaling Recommendations at Quora (RecSys talk 9/16/2016)
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f..."Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
 
Machine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup EventMachine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup Event
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
General introduction to AI ML DL DS
General introduction to AI ML DL DSGeneral introduction to AI ML DL DS
General introduction to AI ML DL DS
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
 
L15.pptx
L15.pptxL15.pptx
L15.pptx
 

More from Xavier Amatriain

Data/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealthData/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealth
Xavier Amatriain
 
AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19
Xavier Amatriain
 
AI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 updateAI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 update
Xavier Amatriain
 
AI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approachAI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approach
Xavier Amatriain
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
Xavier Amatriain
 
AI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for EveryoneAI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for Everyone
Xavier Amatriain
 
Towards online universal quality healthcare through AI
Towards online universal quality healthcare through AITowards online universal quality healthcare through AI
Towards online universal quality healthcare through AI
Xavier Amatriain
 
From one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategyFrom one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategy
Xavier Amatriain
 
Learning to speak medicine
Learning to speak medicineLearning to speak medicine
Learning to speak medicine
Xavier Amatriain
 
ML to cure the world
ML to cure the worldML to cure the world
ML to cure the world
Xavier Amatriain
 
Medical advice as a Recommender System
Medical advice as a Recommender SystemMedical advice as a Recommender System
Medical advice as a Recommender System
Xavier Amatriain
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry Perspective
Xavier Amatriain
 
Machine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora ExampleMachine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora Example
Xavier Amatriain
 
Machine Learning to Grow the World's Knowledge
Machine Learning to Grow  the World's KnowledgeMachine Learning to Grow  the World's Knowledge
Machine Learning to Grow the World's Knowledge
Xavier Amatriain
 
MLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@QuoraMLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@Quora
Xavier Amatriain
 
Lean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven CompaniesLean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven Companies
Xavier Amatriain
 
Recsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem RevisitedRecsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem Revisited
Xavier Amatriain
 
Kdd 2014 Tutorial - the recommender problem revisited
Kdd 2014 Tutorial -  the recommender problem revisitedKdd 2014 Tutorial -  the recommender problem revisited
Kdd 2014 Tutorial - the recommender problem revisited
Xavier Amatriain
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Xavier Amatriain
 

Strata 2016 - Lessons Learned from building real-life Machine Learning Systems

  • 1. Lessons Learned from building real-life Machine Learning Systems Xavier Amatriain (@xamat) www.quora.com/profile/Xavier-Amatriain 3/29/16
  • 3. Our Mission “To share and grow the world’s knowledge” • Millions of questions & answers • Millions of users • Thousands of topics • ...
  • 4. Demand What we care about Quality Relevance
  • 7. More data or better models? Really? Anand Rajaraman: VC, Founder, Stanford Professor
  • 8. More data or better models? Sometimes, it’s not about more data
  • 9. More data or better models? Norvig: “Google does not have better Algorithms only more Data” Many features/ low-bias models
  • 10. More data or better models? Sometimes, it’s not about more data
  • 12. Better models and features that “don’t work” ● E.g. You have a linear model and have been selecting and optimizing features for that model ■ More complex model with the same features -> improvement not likely ■ More expressive features with the same model -> improvement not likely ● More complex features may require a more complex model ● A more complex model may not show improvements with a feature set that is too simple
  • 14. Hyperparameter optimization ● Automate hyperparameter optimization by choosing the right metric. ○ But, is it as simple as choosing the max? ● Bayesian Optimization (Gaussian Processes) better than grid search ○ See spearmint, hyperopt, AutoML, MOE...
  • 16. Supervised/Unsupervised Learning ● Unsupervised learning as dimensionality reduction ● Unsupervised learning as feature engineering ● The “magic” behind combining unsupervised/supervised learning ○ E.g.1 clustering + knn ○ E.g.2 Matrix Factorization ■ MF can be interpreted as ● Unsupervised: ○ Dimensionality Reduction a la PCA ○ Clustering (e.g. NMF) ● Supervised ○ Labeled targets ~ regression
  • 17. Supervised/Unsupervised Learning ● One of the “tricks” in Deep Learning is how it combines unsupervised/supervised learning ○ E.g. Stacked Autoencoders ○ E.g. training of convolutional nets
  • 19. Ensembles ● Netflix Prize was won by an ensemble ○ Initially BellKor was using GBDTs ○ BigChaos introduced ANN-based ensemble ● Most practical applications of ML run an ensemble ○ Why wouldn’t you? ○ At least as good as the best of your methods ○ Can add completely different approaches (e.g. CF and content-based) ○ You can use many different models at the ensemble layer: LR, GBDTs, RFs, ANNs...
  • 20. Ensembles & Feature Engineering ● Ensembles are the way to turn any model into a feature! ● E.g. Don’t know if the way to go is to use Factorization Machines, Tensor Factorization, or RNNs? ○ Treat each model as a “feature” ○ Feed them into an ensemble
  • 21. The Master Algorithm? It definitely is the ensemble!
  • 23. Feature Engineering ● Main properties of a well-behaved ML feature ○ Reusable ○ Transformable ○ Interpretable ○ Reliable ● Reusability: You should be able to reuse features in different models, applications, and teams ● Transformability: Besides directly reusing a feature, it should be easy to use a transformation of it (e.g. log(f), max(f), ∑ft over a time window…)
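The transformations named on this slide (log(f), max(f), and a sum over a time window) can be sketched as below. This is a minimal illustration, not the feature pipeline used by the author: the function names, the `(timestamp, value)` layout, and the use of `log1p` to tolerate zero counts are all assumptions made for the example.

```python
import math

def in_window(observations, now, window):
    """Keep values whose timestamp falls in (now - window, now].

    `observations` is a list of (timestamp, value) pairs (hypothetical layout).
    """
    return [v for t, v in observations if now - window < t <= now]

def transformed_features(observations, now, window):
    """Derive several transformed features from one raw feature."""
    recent = in_window(observations, now, window)
    latest = observations[-1][1] if observations else 0.0
    return {
        # log(f): log1p is used so that a zero count does not raise an error
        "log_latest": math.log1p(latest),
        # max(f) over the window
        "max_in_window": max(recent, default=0.0),
        # sum of f_t over the window
        "sum_in_window": sum(recent),
    }

# Daily upvote counts as (day, count) pairs; all values are made up.
upvotes = [(0, 1.0), (10, 5.0), (20, 2.0), (30, 8.0)]
features = transformed_features(upvotes, now=30, window=15)
```

Keeping the raw feature and its transformations behind one function is one way to make the same feature reusable across models.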
  • 24. Feature Engineering ● Main properties of a well-behaved ML feature ○ Reusable ○ Transformable ○ Interpretable ○ Reliable ● Interpretability: In order to do any of the previous, you need to be able to understand the meaning of features and interpret their values. ● Reliability: It should be easy to monitor and detect bugs/issues in features
  • 25. Feature Engineering Example - Quora Answer Ranking What is a good Quora answer? • truthful • reusable • provides explanation • well formatted • ...
  • 26. Feature Engineering Example - Quora Answer Ranking How are those dimensions translated into features? • Features that relate to the answer quality itself • Interaction features (upvotes/downvotes, clicks, comments…) • User features (e.g. expertise in topic)
  • 28. Implicit vs. Explicit ● Many have acknowledged that implicit feedback is more useful ● Is implicit feedback really always more useful? ● If so, why?
  • 29. Implicit vs. Explicit ● Implicit data is (usually): ○ More dense, and available for all users ○ More representative of user behavior vs. user reflection ○ More related to final objective function ○ Better correlated with AB test results ● E.g. Rating vs watching
  • 30. Implicit vs. Explicit ● However ○ It is not always the case that direct implicit feedback correlates well with long-term retention ○ E.g. clickbait ● Solution: ○ Combine different forms of implicit + explicit to better represent long-term goal
  • 32. Defining training/testing data ● Training a simple binary classifier for good/bad answer ○ Defining positive and negative labels -> Non-trivial task ○ Is this a positive or a negative? ● funny uninformative answer with many upvotes ● short uninformative answer by a well-known expert in the field ● very long informative answer that nobody reads/upvotes ● informative answer with grammar/spelling mistakes ● ...
  • 33. Other training data issues: Time traveling ● Time traveling: usage of features that originated after the event you are trying to predict ○ E.g. Your upvoting an answer is a pretty good predictor of you reading that answer, especially because most upvotes happen AFTER you read the answer ○ Tricky when you have many related features ○ Whenever I see an offline experiment with huge wins, I ask: “Is there time traveling?”
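One common guard against time traveling is to build each training example only from events that happened strictly before the event being predicted. The sketch below assumes a hypothetical `Event` record and a simple count-by-kind feature; it illustrates the cutoff idea and is not the author's pipeline.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    user: str
    answer: str
    kind: str        # e.g. "impression", "upvote" (hypothetical event kinds)
    timestamp: int

def features_before(events, user, answer, cutoff):
    """Count events by kind for (user, answer), using only events
    that occurred strictly before `cutoff`."""
    counts = {}
    for e in events:
        if e.user == user and e.answer == answer and e.timestamp < cutoff:
            counts[e.kind] = counts.get(e.kind, 0) + 1
    return counts

# We want to predict whether the user reads the answer at t=100.
# The upvote at t=120 happened after the read, so it must not leak in.
log = [
    Event("u1", "a1", "impression", 50),
    Event("u1", "a1", "impression", 90),
    Event("u1", "a1", "upvote", 120),
]
safe_features = features_before(log, "u1", "a1", cutoff=100)
```

With the cutoff applied, the upvote does not appear among the features, so an offline win cannot come from information the model would not have at prediction time.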
  • 35. Training a model ● Model will learn according to: ○ Training data (e.g. implicit and explicit) ○ Target function (e.g. probability of user reading an answer) ○ Metric (e.g. precision vs. recall) ● Example 1 (made up): ○ Optimize probability of a user going to the cinema to watch a movie and rate it “highly” by using purchase history and previous ratings. Use NDCG of the ranking as final metric using only movies rated 4 or higher as positives.
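The metric in Example 1 (NDCG with only movies rated 4 or higher counted as positives) can be sketched as follows. This assumes binary relevance and the usual log2 position discount; the function names are chosen for this example.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain; rank 0 is the top of the list."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_ratings, threshold=4):
    """NDCG of a ranking, treating ratings >= threshold as positives.

    `ranked_ratings` holds the user's ratings in the order the model ranked
    the movies (best-ranked first).
    """
    rels = [1 if r >= threshold else 0 for r in ranked_ratings]
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

perfect = ndcg([5, 4, 2])   # both positives ranked on top
swapped = ndcg([2, 5])      # the only positive is ranked second
```

Because the positives are thresholded, a movie rated 3 and a movie rated 1 count the same here; that is a consequence of the metric definition on the slide.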
  • 36. Example 2 - Quora’s feed ● Training data = implicit + explicit ● Target function: Value of showing a story to a user ~ weighted sum of actions: $v = \sum_a v_a \, \mathbb{1}\{y_a = 1\}$ ○ predict probabilities for each action, then compute expected value: $v_{pred} = E[V \mid x] = \sum_a v_a \, p(a \mid x)$ ● Metric: any ranking metric
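The expected-value target on this slide is a weighted sum of per-action probabilities. A minimal sketch, in which the action names, the action values, and the predicted probabilities are all made-up placeholders rather than Quora's real ones:

```python
def expected_value(action_values, action_probs):
    """Sum over actions a of v_a * p(a | x).

    Actions missing from `action_probs` are treated as probability 0.
    """
    return sum(v * action_probs.get(a, 0.0) for a, v in action_values.items())

# Hypothetical value assigned to each action.
ACTION_VALUES = {"upvote": 2.0, "click": 1.0, "downvote": -3.0}

# Hypothetical per-story outputs of the per-action classifiers.
stories = {
    "story_a": {"upvote": 0.10, "click": 0.50, "downvote": 0.02},
    "story_b": {"upvote": 0.05, "click": 0.70, "downvote": 0.20},
}

scores = {s: expected_value(ACTION_VALUES, p) for s, p in stories.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here story_b has the higher click probability but ranks lower, because its downvote probability carries a large negative value.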
  • 37. Offline testing ● Measure model performance, using (IR) metrics ● Offline performance = indication to make decisions on follow-up A/B tests ● A critical (and mostly unsolved) issue is how offline metrics correlate with A/B test results.
  • 39. 2D Navigational modeling (figure; legend: More likely to see / Less likely)
  • 40. The curse of presentation bias ● User can only click on what you decide to show ● But, what you decide to show is the result of what your model predicted is good ● Simply treating things you show as negatives is not likely to work ● Better options ● Correcting for the probability a user will click on a position -> Attention models ● Explore/exploit approaches such as MAB
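One simple way to correct for the probability that a user attends to a position is to divide clicks by the expected number of examinations instead of by raw impressions. The sketch below assumes a basic position-based examination model with made-up examination probabilities; it stands in for the attention models mentioned on the slide and is not a description of them.

```python
def raw_ctr(impressions):
    """Clicks divided by impressions, ignoring where the item was shown."""
    if not impressions:
        return 0.0
    return sum(1 for _, clicked in impressions if clicked) / len(impressions)

def position_corrected_ctr(impressions, exam_prob):
    """Clicks divided by the expected number of examinations.

    `impressions` is a list of (position, clicked) pairs and `exam_prob`
    maps a position to the assumed probability that a user looks at it.
    """
    expected_exams = sum(exam_prob[pos] for pos, _ in impressions)
    clicks = sum(1 for _, clicked in impressions if clicked)
    return clicks / expected_exams if expected_exams > 0 else 0.0

# Hypothetical examination probabilities per position (0 = top slot).
EXAM_PROB = {0: 1.0, 1: 0.5}

# An item shown four times in the second slot and clicked once.
shown = [(1, False), (1, True), (1, False), (1, False)]
naive = raw_ctr(shown)
corrected = position_corrected_ctr(shown, EXAM_PROB)
```

The naive estimate treats the three unclicked impressions as full negatives, while the corrected one accounts for the chance that the user never looked at that slot.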
  • 42. Distributing ML ● Most of what people do in practice can fit into a multi-core machine ○ Smart data sampling ○ Offline schemes ○ Efficient parallel code ● Dangers of “easy” distributed approaches such as Hadoop/Spark ● Do you care about costs? How about latencies?
  • 43. Distributing ML ● Example of optimizing computations to fit them into one machine ○ Spark implementation: 6 hours, 15 machines ○ Developer time: 4 days ○ C++ implementation: 10 minutes, 1 machine ● Most practical applications of Big Data can fit into a (multicore) implementation
  • 45. Data Scientists and ML Engineers ● We all know the definition of a Data Scientist ● Where do Data Scientists fit in an organization? ○ Many companies struggle with this ● Valuable to have strong DS who can bring value from the data ● Strong DS with solid engineering skills are unicorns and finding them is not scalable ○ DS need engineers to bring things to production ○ Engineers have too much on their plate to be willing to “productionize” cool DS projects
  • 46. The data-driven ML innovation funnel (figure; stages: Data Research -> ML Exploration - Product Design -> AB Testing)
  • 47. Data Scientists and ML Engineers ● Solution: ○ (1) Define different parts of the innovation funnel ■ Part 1. Data research & hypothesis building -> Data Science ■ Part 2. ML solution building & implementation -> ML Engineering ■ Part 3. Online experimentation, AB Testing analysis -> Data Science ○ (2) Broaden the definition of ML Engineers to span from coding experts with high-level ML knowledge to ML experts with good software skills
  • 49. ● In data, size is not all that matters ● Understand dependencies between data, models & systems ● Choose the right metric & optimize what matters ● Be thoughtful about ○ your ML infrastructure/tools ○ how you organize your teams