SlideShare a Scribd company logo
LESSONS
LEARNED
from
building practical
Deep Learning systems
Xavier Amatriain (@xamat)
A bit about myself...
A bit about myself...
● PhD on Audio and Music Signal Processing and Modeling
● Researcher in Recommender Systems for several years
● Led ML Research/Engineering at Netflix
● VP of Engineering at Quora
● Currently co-founder/CTO at Curai (Providing the world’s best healthcare to
everyone)
A bit about Curai...
What are we doing?
● Mission: Provide the world's
best healthcare for everyone
● Product: User-facing mobile
primary care app
● Team: Building an awesome
and diverse team
● Approach: State-of-the-art
AI/ML + product/UX/clinical
AI-based interaction
AI + Health coaches
AI + Doctors
Peer-reviewed research at Curai
Lessons learned...
More Data
or
Better Data?
Lesson 1
More data or better models?
Really?
More data or better models?
Sometimes,
it’s not about
more data
More data or better models?
Norvig:
“Google does not have
better Algorithms only
more Data”
Many
features/
low-bias
models
More data or better models?
Sometimes
you might not
need all your
“Big Data”
0 2 4 6 8 10 12 14 16 18 20
Number of Training Examples (in Millions)
TestingAccuracy
What about Deep Learning?
Year Breakthrough in AI Datasets (First Available) Algorithms (First Proposal)
1994 Human-level spontaneous speech recognition Spoken Wall Street Journal articles and other
texts (1991)
Hidden Markov Model (1984)
1997 IBM Deep Blue defeated Garry Kasparov 700,000 Grandmaster chess games, aka “The
Extended Book” (1991)
Negascout planning algorithm (1983)
2005 Google’s Arabic- and Chinese-to-English
translation
1,8 trillion tokens from Google Web and News
pages (collected in 2005)
Statistical machine translation algorithm (1988)
2011 IBM watson become the world Jeopardy!
Champion
8,6 million documents from Wikipedia,
Wiktionary, Wikiquote, and Project Gutenberg
(updated in 2005)
Mixture-of-Experts algorithm (1991)
2014 Google’s GoogLeNet object classification at
near-human performance
ImageNet corpus of 1,5 million labeled images
and 1,000 object catagories (2010)
Convolution neural network algorithm (1989)
2015 Google’s Deepmind achieved human parity in
playing 29 Atari games by learning general
control from video
Arcade Learning Environment dataset of over
50 Atari games (2013)
Q-learning algorithm (1992)
Average No. Of Years to Breakthrough 3 years 18 years
The average elapsed time between key algorithm proposals and corresponding advances was about 18 years,
whereas the average elapsed time between key dataset availabilities and corresponding advances was less
than 3 years, or about 6 times faster.
What about Deep Learning?
Models and
Recipes
Pretrained
Available models trained using OpenNMT
→ English → German
→ German → English
→ English Summarization
→ Multi-way – FR,ES,PT,IT,RO < > FR,ES,PT,IT,RO
More models coming soon:
→ Ubuntu Dialog Dataset
→ Syntactic Parsing
→ Image-to-Text
More data and better data
Simple
Models
>>
Complex
Models
Lesson 2
Football or Futbol?
Occam’s razor
Given two models that perform
more or less equally, you should
always prefer the less complex
Deep Learning might not be
preferred, even if it squeezes a
+1% in accuracy
Reasons to prefer a simpler model
Reasons to prefer a simpler model
….
There are many others
System complexity
Maintenance
Explainability
….
Figure 3: GoogLeNet network with all the bells and whistles
A real-life example
Goal: Supervised
Classification
→ 40 features
→ 10k examples
What did the ML
Engineer choose?
→ Multi-layer ANN trained
with Tensor Flow
What was his proposed
next step?
→ Try ConvNets
Where is the problem?
→ Hours to train, already
looking into distributing
→ There are much simpler
approaches
.... But,
sometimes
you do need
a Complex
Model
Lesson 3
Better models and features that “don’t work”
E.g. You have a linear model and have been
selecting and optimizing features for that model
→ More complex model with the same features -> improvement
not likely
→ More expressive features -> improvement not likely
More complex features may require a more
complex model
A more complex model may not show
improvements with a feature set that is too
simple
Yes, you
should care
about
Feature
Engineering
Lesson 4
Feature Engineering Example - Answer Ranking
How are those dimensions
translated into features?
Features that relate to the answer
Quality itself
Interaction features (upvotes/downvotes,
clicks, comments…)
User features (e.g. expertise in topic)
What is a good Quora answer?
Truthful Reusable Provides
explanation
Well
formatted ...
Feature Engineering
Properties of a well-
behaved ML feature Output
Mapping
from
features
OutputOutput
Most
complex
features
Mapping
from
features
Mapping
from
features
Output
Simplest
features
Features
Hand –
designed
features
Hand –
designed
program
InputInputInputInput
Rule -
based
systems
Classic
machine
learning
Representation
learning
Deep
learning
Fig; I. Goodfellow
Deep Learning:
Automating
Feature Discovery
Interpretable
Reliable
Reusable
Transformable
Deep Learning & Feature Engineering
Deep Learning & Feature Architecture Engineering
Supervised
vs.
Unsupervised
Learning
Lesson 5
Supervised/Unsupervised Learning
Unsupervised learning as
dimensionality reduction
E.g.1
Clustering + knn
E.g.2
Matrix Factorization
MF can be
interpreted as
Unsupervised:
• Dimensionality Reduction a la PCA
• Clustering (e.g. NMF)
Supervised:
• Labeled targets ~ regression
Unsupervised learning
as feature engineering
The “magic” behind combining unsupervised/supervised learning
Supervised/Unsupervised Learning
One of the “tricks” in
Deep Learning is how it
combines unsupervised
/supervised learning
→ E.g. Stacked Autoencoders
→ E.g. training of convolutional nets
X1
X2
X3
X4
X5
X6
+1
+1
+1
...
...
...
Input Features I Features II Softmax
classifier
P(y=0 | x)
P(y=1 | x)
P(y=2 | x)
Stacked
Autoencoders
Input
83x83
Layer 1
64x75x75
Layer 2
64@14x14
Layer 3
256@6x6
Layer 4
256@1x1
Output4
101
9x9
Convolution
(64 kernels)
10x10 pooling
5x5 subsampling
9x9
Convolution
(4096 kernels)
6x6 pooling
4x4 subsamp
→ Non-Linearity: half-wave rectification, shrinkage function, sigmoid
→ Pooling: average, L1, L2, max
→ Training: Supervised (1988-2006), Unsupervised+supervised (2006-now)
Convolutional Network (CovNet)
Neural Networks
Supervised
Unsupervised
Superviseed
Boost
ing
SVM
Decis
ion
Tree
Perc
eptro
n
AE D-AE
Neur
al
Net
RNN
Conv
. Net
RBM Spar
se
Codi
ng
DBN DBM
GMM Baye
s NP
ΣΠ
Supervised/Unsupervised Self-supervised Learning
Self-supervision
→ E.g. BERT and other LM
Everything is
an ensembleLesson 6
Ensembles
Netflix Prize was won by an ensemble
Most practical applications of ML run
an ensemble
→ Initially Bellkor was using GDBTs
→ BigChaos introduced ANN-based ensemble
→ Why wouldn’t you?
→ At least as good as the best of your methods
→ Can add completely different approaches (e.g. CF
and content-based)
→ You can use many different models at the ensemble
layer: LR, GDBTs, RFs, ANNs...
Ensembles & Feature Engineering
Ensembles are
the way to turn
any model into a
feature!
E.g. Don’t know if the
way to go is to use
Factorization Machines,
Tensor Factorization, or
RNNs?
→ Treat each model as a
“feature”
→ Feed them into an
ensemble
Sigmoid
Rectified
Linear Units
Output Units
Hidden Layers
Dense
Embeddings
Sparse
Features
Wide Models Deep Models Wide & Deep Models
There are
biases in your
data
Lesson 7
Defining training/testing data
Training a simple binary classifier for
good/bad answer
→ Defining positive and negative labels ->
Non-trivial task
→ Is this a positive or a negative?
→ funny uninformative answer with many
upvotes
→ short uninformative answer by a
well-known expert in the field
→ very long informative answer that nobody
reads/upvotes
→ informative answer with
grammar/spelling mistakes
→ ...
The curse of presentation bias
Better options
→ Correcting for the probability
a user will click on a position
-> Attention models
→ Explore/exploit approaches
such as MAB
Simply treating things you
show as negatives is not likely
to work
User can only click on what
you decide to show
→ But, what you decide to
show is the result of what
your model predicted is good
More
likely
to see
Less
likely
Bias & Fairness
Think about your
models
“in the wild”
Lesson 8
AI in the wild: Desired properties
● Easily extensible
○ Incrementally/iteratively learn from
“human-in-the-loop” or from
additional data
● Knows what it does not know
○ Models uncertainty in prediction
○ Enables fall-back to manual
Assisted diagnosis in the wild
1. Extensibility
a. Diagnosis as a ML task
i. Expert systems as a prior
b. Modeling less prevalent diseases
i. Low-shot learning
2. Knowing what you don’t know
b. Measures of uncertainty in
prediction
c. Allows fall-back to
“physician-in-the-loop”
Data and Models are great.
You know what’s even better?
The right
evaluation
approach!
Lesson 9
Offline/Online testing process
Offline Experimentation Online Experimentation
Initial
Hypothesis
Design AB
Test
Choose Control
Deploy Prototype
Observe Behavior
Analyze Results
Significant
Improvements?
Choose Model
Train Model
Test Offline
Hypothesis
Validated?
Try different
Model?
Reformulated
Hypothesis
Deploy
Feature
NO
YES
NO YES
NO
YES
Executing A/B tests
Overall Evaluation Criteria (OEC) =
e.g. member retention at Netflix
→ Use long-term metrics
whenever possible
→ Short-term metrics can be
informative and allow faster
decisions
⁻ But, not always aligned with
OEC
Measure differences
in metrics across
statistically identical
populations that
each experience a
different algorithm.
Decisions on the product always
data-driven
Offline testing
Measure model
performance, using (IR)
metrics
Offline performance =
indication to make decisions
on follow-up A/B tests
A critical (and mostly
unsolved) issue is how
offline metrics correlate with
A/B test results.
Do not
underestimate
the value of
systems and
frameworks
Lesson 10
ML vs Software
Can you treat your ML infrastructure as you
would your software one?
→ Yes and No
You should apply best Software Engineering
practices (e.g. encapsulation, abstraction,
cohesion, low coupling…)
However, Design Patterns for Machine Learning
software are not well known/documented
Software: the new frontier of ML?
Your AI
infrastructure
will have two
masters
Lesson 11
Machine Learning Infrastructure
→ Whenever you develop any ML infrastructure, you need to target two different modes:
Mode 1: ML experimentation
− Flexibility
− Easy-to-use
− Reusability
Mode 2: ML production
− All of the above + performance & scalability
→ Ideally you want the two modes to be as similar as possible
→ How to combine them?
Machine Learning Infrastructure
→ Favor experimentation and only invest in
productionizing once something shows
results
→ E.g. Have ML researchers use R and
then ask Engineers
to implement things in production when
they work
Option 1
→ Favor production and have “researchers”
struggle to figure out how to run
experiments
→ E.g. Implement highly optimized C++
code and have ML researchers
experiment only through data available
in logs/DB
Option 2
Machine Learning Infrastructure
→ Favor experimentation and only invest in
productionizing
once something shows results
→ E.g. Have ML researchers use R and
then ask Engineers
to implement things in production when
they work
Option 1
→ Favor production and have “researchers”
struggle to figure
out how to run experiments
→ E.g. Implement highly optimized C++
code and have ML researchers
experiment only through data available
in logs/DB
Option 2
Machine Learning Infrastructure
Good
intermediate
options
→ Have ML “researchers” experiment on Jupyter Notebooks using
Python tools (scikit-learn, Pytorch, TF…). Use same tools in
production whenever possible, implement optimized versions only
when needed.
→ Implement abstraction layers on top of optimized implementations
so they can be accessed from regular/friendly experimentation tools
There is ML
beyond Deep
Learning
Lesson 12
Other ML Advances
● Factorization Machines
● Tensor Methods
● Non-parametric Bayesian models
● XGBoost
● Online Learning
● Reinforcement Learning
● Learning to rank
● ...
Other very successful approaches
Sometimes DL does not win
Conclusions
01.
02.
03.
04.
05.
Choose the right metric
Be thoughtful about your data
Understand dependencies between data, models & systems
Optimize only what matters, beware of biases
Be thoughtful about : Your ML infrastructure/tools,
About organizing your teams
LESSONS
LEARNED
from
building practical
Deep Learning systems
Xavier Amatriain (@xamat)

More Related Content

What's hot

Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Yves Raimond
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
Amit Sharma
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
Xavier Amatriain
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
Enrico Palumbo
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix Perspective
Justin Basilico
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
Justin Basilico
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023
HyunJoon Jung
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
HJ van Veen
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
Jaya Kawale
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
ConnorShorten2
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...
Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...
Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...
Qi Guo
 
Approximate nearest neighbor methods and vector models – NYC ML meetup
Approximate nearest neighbor methods and vector models – NYC ML meetupApproximate nearest neighbor methods and vector models – NYC ML meetup
Approximate nearest neighbor methods and vector models – NYC ML meetup
Erik Bernhardsson
 
What’s next for deep learning for Search?
What’s next for deep learning for Search?What’s next for deep learning for Search?
What’s next for deep learning for Search?
Bhaskar Mitra
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
ChaoYang81
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
Justin Basilico
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
Yves Raimond
 
Large Language Models Are Reasoning Teachers
Large Language Models Are Reasoning TeachersLarge Language Models Are Reasoning Teachers
Large Language Models Are Reasoning Teachers
Namgyu Ho
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
David Rostcheck
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data Scientist
Daniel Tunkelang
 

What's hot (20)

Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix Perspective
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...
Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...
Talent Search and Recommendation Systems at LinkedIn: Practical Challenges an...
 
Approximate nearest neighbor methods and vector models – NYC ML meetup
Approximate nearest neighbor methods and vector models – NYC ML meetupApproximate nearest neighbor methods and vector models – NYC ML meetup
Approximate nearest neighbor methods and vector models – NYC ML meetup
 
What’s next for deep learning for Search?
What’s next for deep learning for Search?What’s next for deep learning for Search?
What’s next for deep learning for Search?
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
 
Large Language Models Are Reasoning Teachers
Large Language Models Are Reasoning TeachersLarge Language Models Are Reasoning Teachers
Large Language Models Are Reasoning Teachers
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data Scientist
 

Similar to Lessons learned from building practical deep learning systems

Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
Charmi Chokshi
 
ML crash course
ML crash courseML crash course
ML crash course
mikaelhuss
 
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systemsBIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
Xavier Amatriain
 
Staying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning WorldStaying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning World
Xavier Amatriain
 
Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning
CCG
 
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use CaseData Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Formulatedby
 
Data Science Salon Miami Presentation
Data Science Salon Miami PresentationData Science Salon Miami Presentation
Data Science Salon Miami Presentation
Greg Werner
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
HJ van Veen
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
JoanJeremiah
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Xavier Amatriain
 
林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning
台灣資料科學年會
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
Johnson Ubah
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Xavier Amatriain
 
IntroML_1_Introduction_Tagged.pdf
IntroML_1_Introduction_Tagged.pdfIntroML_1_Introduction_Tagged.pdf
IntroML_1_Introduction_Tagged.pdf
Elio Laureano
 
IntroML_1_Introduction
IntroML_1_IntroductionIntroML_1_Introduction
IntroML_1_Introduction
Elio Laureano
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AI
Greg Werner
 
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
Wuhan University
 
Machine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup EventMachine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup Event
Benjamin Schulte
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
Ankit Gupta
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
NeeleEilers
 

Similar to Lessons learned from building practical deep learning systems (20)

Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
ML crash course
ML crash courseML crash course
ML crash course
 
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systemsBIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
 
Staying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning WorldStaying Shallow & Lean in a Deep Learning World
Staying Shallow & Lean in a Deep Learning World
 
Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning
 
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use CaseData Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
 
Data Science Salon Miami Presentation
Data Science Salon Miami PresentationData Science Salon Miami Presentation
Data Science Salon Miami Presentation
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
 
林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
 
IntroML_1_Introduction_Tagged.pdf
IntroML_1_Introduction_Tagged.pdfIntroML_1_Introduction_Tagged.pdf
IntroML_1_Introduction_Tagged.pdf
 
IntroML_1_Introduction
IntroML_1_IntroductionIntroML_1_Introduction
IntroML_1_Introduction
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AI
 
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
 
Machine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup EventMachine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup Event
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
 

More from Xavier Amatriain

Data/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealthData/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealth
Xavier Amatriain
 
AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19
Xavier Amatriain
 
AI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 updateAI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 update
Xavier Amatriain
 
AI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approachAI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approach
Xavier Amatriain
 
AI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for EveryoneAI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for Everyone
Xavier Amatriain
 
Towards online universal quality healthcare through AI
Towards online universal quality healthcare through AITowards online universal quality healthcare through AI
Towards online universal quality healthcare through AI
Xavier Amatriain
 
From one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategyFrom one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategy
Xavier Amatriain
 
Learning to speak medicine
Learning to speak medicineLearning to speak medicine
Learning to speak medicine
Xavier Amatriain
 
ML to cure the world
ML to cure the worldML to cure the world
ML to cure the world
Xavier Amatriain
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
Xavier Amatriain
 
Medical advice as a Recommender System
Medical advice as a Recommender SystemMedical advice as a Recommender System
Medical advice as a Recommender System
Xavier Amatriain
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry Perspective
Xavier Amatriain
 
Machine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora ExampleMachine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora Example
Xavier Amatriain
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Xavier Amatriain
 
Barcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons LearnedBarcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons Learned
Xavier Amatriain
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
Xavier Amatriain
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
Xavier Amatriain
 
Machine Learning to Grow the World's Knowledge
Machine Learning to Grow  the World's KnowledgeMachine Learning to Grow  the World's Knowledge
Machine Learning to Grow the World's Knowledge
Xavier Amatriain
 
MLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@QuoraMLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@Quora
Xavier Amatriain
 
Lean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven CompaniesLean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven Companies
Xavier Amatriain
 

More from Xavier Amatriain (20)

Data/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealthData/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealth
 
AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19
 
AI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 updateAI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 update
 
AI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approachAI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approach
 
AI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for EveryoneAI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for Everyone
 
Towards online universal quality healthcare through AI
Towards online universal quality healthcare through AITowards online universal quality healthcare through AI
Towards online universal quality healthcare through AI
 
From one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategyFrom one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategy
 
Learning to speak medicine
Learning to speak medicineLearning to speak medicine
Learning to speak medicine
 
ML to cure the world
ML to cure the worldML to cure the world
ML to cure the world
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
 
Medical advice as a Recommender System
Medical advice as a Recommender SystemMedical advice as a Recommender System
Medical advice as a Recommender System
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry Perspective
 
Machine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora ExampleMachine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora Example
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
Barcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons LearnedBarcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons Learned
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
 
Machine Learning to Grow the World's Knowledge
Machine Learning to Grow  the World's KnowledgeMachine Learning to Grow  the World's Knowledge
Machine Learning to Grow the World's Knowledge
 
MLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@QuoraMLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@Quora
 
Lean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven CompaniesLean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven Companies
 

Recently uploaded

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 

Recently uploaded (20)

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 

Lessons learned from building practical deep learning systems

  • 1. LESSONS LEARNED from building practical Deep Learning systems Xavier Amatriain (@xamat)
  • 2. A bit about myself...
  • 3. A bit about myself... ● PhD on Audio and Music Signal Processing and Modeling ● Researcher in Recommender Systems for several years ● Led ML Research/Engineering at Netflix ● VP of Engineering at Quora ● Currently co-founder/CTO at Curai (Providing the world’s best healthcare to everyone)
  • 4. A bit about Curai...
  • 5. What are we doing? ● Mission: Provide the world's best healthcare for everyone ● Product: User-facing mobile primary care app ● Team: Building an awesome and diverse team ● Approach: State-of-the-art AI/ML + product/UX/clinical AI-based interaction AI + Health coaches AI + Doctors
  • 9. More data or better models? Really?
  • 10. More data or better models? Sometimes, it’s not about more data
  • 11. More data or better models? Norvig: “Google does not have better Algorithms only more Data” Many features/ low-bias models
  • 12. More data or better models? Sometimes you might not need all your “Big Data” 0 2 4 6 8 10 12 14 16 18 20 Number of Training Examples (in Millions) TestingAccuracy
  • 13. What about Deep Learning? Year Breakthrough in AI Datasets (First Available) Algorithms (First Proposal) 1994 Human-level spontaneous speech recognition Spoken Wall Street Journal articles and other texts (1991) Hidden Markov Model (1984) 1997 IBM Deep Blue defeated Garry Kasparov 700,000 Grandmaster chess games, aka “The Extended Book” (1991) Negascout planning algorithm (1983) 2005 Google’s Arabic- and Chinese-to-English translation 1,8 trillion tokens from Google Web and News pages (collected in 2005) Statistical machine translation algorithm (1988) 2011 IBM watson become the world Jeopardy! Champion 8,6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2005) Mixture-of-Experts algorithm (1991) 2014 Google’s GoogLeNet object classification at near-human performance ImageNet corpus of 1,5 million labeled images and 1,000 object catagories (2010) Convolution neural network algorithm (1989) 2015 Google’s Deepmind achieved human parity in playing 29 Atari games by learning general control from video Arcade Learning Environment dataset of over 50 Atari games (2013) Q-learning algorithm (1992) Average No. Of Years to Breakthrough 3 years 18 years The average elapsed time between key algorithm proposals and corresponding advances was about 18 years, whereas the average elapsed time between key dataset availabilities and corresponding advances was less than 3 years, or about 6 times faster.
  • 14. What about Deep Learning? Models and Recipes Pretrained Available models trained using OpenNMT → English → German → German → English → English Summarization → Multi-way – FR,ES,PT,IT,RO < > FR,ES,PT,IT,RO More models coming soon: → Ubuntu Dialog Dataset → Syntactic Parsing → Image-to-Text
  • 15. More data and better data
  • 18. Occam’s razor Given two models that perform more or less equally, you should always prefer the less complex Deep Learning might not be preferred, even if it squeezes a +1% in accuracy
  • 19. Reasons to prefer a simpler model
  • 20. Reasons to prefer a simpler model …. There are many others System complexity Maintenance Explainability …. Figure 3: GoogLeNet network with all the bells and whistles
  • 21. A real-life example Goal: Supervised Classification → 40 features → 10k examples What did the ML Engineer choose? → Multi-layer ANN trained with Tensor Flow What was his proposed next step? → Try ConvNets Where is the problem? → Hours to train, already looking into distributing → There are much simpler approaches
  • 22. .... But, sometimes you do need a Complex Model Lesson 3
  • 23. Better models and features that “don’t work” E.g. You have a linear model and have been selecting and optimizing features for that model → More complex model with the same features -> improvement not likely → More expressive features -> improvement not likely More complex features may require a more complex model A more complex model may not show improvements with a feature set that is too simple
  • 25. Feature Engineering Example - Answer Ranking How are those dimensions translated into features? Features that relate to the answer Quality itself Interaction features (upvotes/downvotes, clicks, comments…) User features (e.g. expertise in topic) What is a good Quora answer? Truthful Reusable Provides explanation Well formatted ...
  • 26. Feature Engineering Properties of a well- behaved ML feature Output Mapping from features OutputOutput Most complex features Mapping from features Mapping from features Output Simplest features Features Hand – designed features Hand – designed program InputInputInputInput Rule - based systems Classic machine learning Representation learning Deep learning Fig; I. Goodfellow Deep Learning: Automating Feature Discovery Interpretable Reliable Reusable Transformable
  • 27. Deep Learning & Feature Engineering
  • 28. Deep Learning & Feature Architecture Engineering
  • 30. Supervised/Unsupervised Learning Unsupervised learning as dimensionality reduction E.g.1 Clustering + knn E.g.2 Matrix Factorization MF can be interpreted as Unsupervised: • Dimensionality Reduction a la PCA • Clustering (e.g. NMF) Supervised: • Labeled targets ~ regression Unsupervised learning as feature engineering The “magic” behind combining unsupervised/supervised learning
  • 31. Supervised/Unsupervised Learning One of the “tricks” in Deep Learning is how it combines unsupervised /supervised learning → E.g. Stacked Autoencoders → E.g. training of convolutional nets X1 X2 X3 X4 X5 X6 +1 +1 +1 ... ... ... Input Features I Features II Softmax classifier P(y=0 | x) P(y=1 | x) P(y=2 | x) Stacked Autoencoders Input 83x83 Layer 1 64x75x75 Layer 2 64@14x14 Layer 3 256@6x6 Layer 4 256@1x1 Output4 101 9x9 Convolution (64 kernels) 10x10 pooling 5x5 subsampling 9x9 Convolution (4096 kernels) 6x6 pooling 4x4 subsamp → Non-Linearity: half-wave rectification, shrinkage function, sigmoid → Pooling: average, L1, L2, max → Training: Supervised (1988-2006), Unsupervised+supervised (2006-now) Convolutional Network (CovNet) Neural Networks Supervised Unsupervised Superviseed Boost ing SVM Decis ion Tree Perc eptro n AE D-AE Neur al Net RNN Conv . Net RBM Spar se Codi ng DBN DBM GMM Baye s NP ΣΠ
  • 34. Ensembles Netflix Prize was won by an ensemble Most practical applications of ML run an ensemble → Initially Bellkor was using GDBTs → BigChaos introduced ANN-based ensemble → Why wouldn’t you? → At least as good as the best of your methods → Can add completely different approaches (e.g. CF and content-based) → You can use many different models at the ensemble layer: LR, GDBTs, RFs, ANNs...
  • 35. Ensembles & Feature Engineering Ensembles are the way to turn any model into a feature! E.g. Don’t know if the way to go is to use Factorization Machines, Tensor Factorization, or RNNs? → Treat each model as a “feature” → Feed them into an ensemble Sigmoid Rectified Linear Units Output Units Hidden Layers Dense Embeddings Sparse Features Wide Models Deep Models Wide & Deep Models
  • 36. There are biases in your data Lesson 7
  • 37. Defining training/testing data Training a simple binary classifier for good/bad answer → Defining positive and negative labels -> Non-trivial task → Is this a positive or a negative? → funny uninformative answer with many upvotes → short uninformative answer by a well-known expert in the field → very long informative answer that nobody reads/upvotes → informative answer with grammar/spelling mistakes → ...
  • 38. The curse of presentation bias Better options → Correcting for the probability a user will click on a position -> Attention models → Explore/exploit approaches such as MAB Simply treating things you show as negatives is not likely to work User can only click on what you decide to show → But, what you decide to show is the result of what your model predicted is good More likely to see Less likely
  • 40. Think about your models “in the wild” Lesson 8
  • 41. AI in the wild: Desired properties ● Easily extensible ○ Incrementally/iteratively learn from “human-in-the-loop” or from additional data ● Knows what it does not know ○ Models uncertainty in prediction ○ Enables fall-back to manual
  • 42. Assisted diagnosis in the wild 1. Extensibility a. Diagnosis as a ML task i. Expert systems as a prior b. Modeling less prevalent diseases i. Low-shot learning 2. Knowing what you don’t know b. Measures of uncertainty in prediction c. Allows fall-back to “physician-in-the-loop”
  • 43. Data and Models are great. You know what’s even better? The right evaluation approach! Lesson 9
  • 44. Offline/Online testing process Offline Experimentation Online Experimentation Initial Hypothesis Design AB Test Choose Control Deploy Prototype Observe Behavior Analyze Results Significant Improvements? Choose Model Train Model Test Offline Hypothesis Validated? Try different Model? Reformulated Hypothesis Deploy Feature NO YES NO YES NO YES
  • 45. Executing A/B tests Overall Evaluation Criteria (OEC) = e.g. member retention at Netflix → Use long-term metrics whenever possible → Short-term metrics can be informative and allow faster decisions ⁻ But, not always aligned with OEC Measure differences in metrics across statistically identical populations that each experience a different algorithm. Decisions on the product always data-driven
  • 46. Offline testing Measure model performance, using (IR) metrics Offline performance = indication to make decisions on follow-up A/B tests A critical (and mostly unsolved) issue is how offline metrics correlate with A/B test results.
  • 47. Do not underestimate the value of systems and frameworks Lesson 10
  • 48. ML vs Software Can you treat your ML infrastructure as you would your software one? → Yes and No You should apply best Software Engineering practices (e.g. encapsulation, abstraction, cohesion, low coupling…) However, Design Patterns for Machine Learning software are not well known/documented
  • 49. Software: the new frontier of ML?
  • 50. Your AI infrastructure will have two masters Lesson 11
  • 51. Machine Learning Infrastructure → Whenever you develop any ML infrastructure, you need to target two different modes: Mode 1: ML experimentation − Flexibility − Easy-to-use − Reusability Mode 2: ML production − All of the above + performance & scalability → Ideally you want the two modes to be as similar as possible → How to combine them?
  • 52. Machine Learning Infrastructure → Favor experimentation and only invest in productionizing once something shows results → E.g. Have ML researchers use R and then ask Engineers to implement things in production when they work Option 1 → Favor production and have “researchers” struggle to figure out how to run experiments → E.g. Implement highly optimized C++ code and have ML researchers experiment only through data available in logs/DB Option 2
  • 53. Machine Learning Infrastructure → Favor experimentation and only invest in productionizing once something shows results → E.g. Have ML researchers use R and then ask Engineers to implement things in production when they work Option 1 → Favor production and have “researchers” struggle to figure out how to run experiments → E.g. Implement highly optimized C++ code and have ML researchers experiment only through data available in logs/DB Option 2
  • 54. Machine Learning Infrastructure Good intermediate options → Have ML “researchers” experiment on Jupyter Notebooks using Python tools (scikit-learn, Pytorch, TF…). Use same tools in production whenever possible, implement optimized versions only when needed. → Implement abstraction layers on top of optimized implementations so they can be accessed from regular/friendly experimentation tools
  • 55. There is ML beyond Deep Learning Lesson 12
  • 56. Other ML Advances ● Factorization Machines ● Tensor Methods ● Non-parametric Bayesian models ● XGBoost ● Online Learning ● Reinforcement Learning ● Learning to rank ● ...
  • 57. Other very successful approaches
  • 58. Sometimes DL does not win
  • 60. 01. 02. 03. 04. 05. Choose the right metric Be thoughtful about your data Understand dependencies between data, models & systems Optimize only what matters, beware of biases Be thoughtful about : Your ML infrastructure/tools, About organizing your teams
  • 61. LESSONS LEARNED from building practical Deep Learning systems Xavier Amatriain (@xamat)