LESSONS
LEARNED
from
building practical
Deep Learning systems
Xavier Amatriain (@xamat)
A bit about myself...
● PhD on Audio and Music Signal Processing and Modeling
● Researcher in Recommender Systems for several years
● Led ML Research/Engineering at Netflix
● VP of Engineering at Quora
● Currently co-founder/CTO at Curai (Providing the world’s best healthcare to
everyone)
A bit about Curai...
What are we doing?
● Mission: Provide the world's
best healthcare for everyone
● Product: User-facing mobile
primary care app
● Team: Building an awesome
and diverse team
● Approach: State-of-the-art
AI/ML + product/UX/clinical
AI-based interaction
AI + Health coaches
AI + Doctors
Peer-reviewed research at Curai
Lessons learned...
More Data
or
Better Data?
Lesson 1
More data or better models?
Really?
More data or better models?
Sometimes,
it’s not about
more data
More data or better models?
Norvig:
“Google does not have
better Algorithms only
more Data”
Many
features/
low-bias
models
More data or better models?
Sometimes
you might not
need all your
“Big Data”
[Figure: testing accuracy vs. number of training examples (in millions, 0 to 20)]
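A learning curve of this kind is cheap to produce before investing in more data. The sketch below is only an illustration: the synthetic two-class data, the nearest-centroid model and the names `make_data` and `learning_curve` are assumptions, not anything from the talk. It trains on growing subsets and reports held-out accuracy, which flattens out once the model has seen enough examples.

```python
import random

def make_data(n, rng):
    """Draw n balanced 2-D points from two Gaussian blobs (synthetic, hypothetical data)."""
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

def fit_centroids(train):
    """Nearest-centroid model: one mean point per class."""
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for (x, y), label in train:
        s = sums[label]
        s[0] += x
        s[1] += y
        s[2] += 1
    return {c: (s[0] / s[2], s[1] / s[2]) for c, s in sums.items()}

def accuracy(centroids, test):
    """Fraction of test points whose closest centroid matches their label."""
    correct = 0
    for (x, y), label in test:
        pred = min(centroids,
                   key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
        correct += pred == label
    return correct / len(test)

def learning_curve(sizes, seed=0):
    """Held-out accuracy for each training-set size in `sizes` (each size >= 2)."""
    rng = random.Random(seed)
    test = make_data(2000, rng)
    pool = make_data(max(sizes), rng)
    return [(n, accuracy(fit_centroids(pool[:n]), test)) for n in sizes]

curve = learning_curve([20, 200, 2000, 20000])
for n, acc in curve:
    print(f"{n:>6} training examples -> test accuracy {acc:.3f}")
```

If the curve is already flat at a fraction of the available data, sampling is likely enough for that model; a curve that is still rising argues for more data.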
What about Deep Learning?
Year | Breakthrough in AI | Datasets (First Available) | Algorithms (First Proposal)
1994 | Human-level spontaneous speech recognition | Spoken Wall Street Journal articles and other texts (1991) | Hidden Markov Model (1984)
1997 | IBM Deep Blue defeated Garry Kasparov | 700,000 Grandmaster chess games, aka “The Extended Book” (1991) | Negascout planning algorithm (1983)
2005 | Google’s Arabic- and Chinese-to-English translation | 1.8 trillion tokens from Google Web and News pages (collected in 2005) | Statistical machine translation algorithm (1988)
2011 | IBM Watson became the world Jeopardy! champion | 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2005) | Mixture-of-Experts algorithm (1991)
2014 | Google’s GoogLeNet object classification at near-human performance | ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010) | Convolutional neural network algorithm (1989)
2015 | Google’s DeepMind achieved human parity in playing 29 Atari games by learning general control from video | Arcade Learning Environment dataset of over 50 Atari games (2013) | Q-learning algorithm (1992)
Average No. of Years to Breakthrough | | 3 years | 18 years
The average elapsed time between key algorithm proposals and corresponding advances was about 18 years,
whereas the average elapsed time between key dataset availabilities and corresponding advances was less
than 3 years, or about 6 times faster.
What about Deep Learning?
Models and
Recipes
Pretrained
Available models trained using OpenNMT
→ English → German
→ German → English
→ English Summarization
→ Multi-way – FR,ES,PT,IT,RO < > FR,ES,PT,IT,RO
More models coming soon:
→ Ubuntu Dialog Dataset
→ Syntactic Parsing
→ Image-to-Text
More data and better data
Simple
Models
>>
Complex
Models
Lesson 2
Football or Futbol?
Occam’s razor
Given two models that perform
more or less equally, you should
always prefer the less complex
Deep Learning might not be
preferred, even if it squeezes a
+1% in accuracy
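One way to make this rule operational is to pick the least complex candidate whose validation score is within a tolerance of the best one. This is a minimal sketch under stated assumptions: the `Candidate` fields, the complexity proxy and all the numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float    # validation accuracy in [0, 1]
    complexity: int    # hypothetical proxy, e.g. number of parameters

def pick_model(candidates, tolerance=0.01):
    """Return the least complex candidate within `tolerance` of the best accuracy."""
    best = max(c.accuracy for c in candidates)
    good_enough = [c for c in candidates if best - c.accuracy <= tolerance]
    return min(good_enough, key=lambda c: c.complexity)

# Made-up numbers for illustration only.
candidates = [
    Candidate("logistic_regression", accuracy=0.912, complexity=41),
    Candidate("gradient_boosting", accuracy=0.915, complexity=5_000),
    Candidate("deep_net", accuracy=0.921, complexity=2_000_000),
]
chosen = pick_model(candidates, tolerance=0.01)
print(chosen.name)
```

The tolerance encodes how much accuracy you are willing to trade for lower system complexity, easier maintenance and better explainability; with a tolerance of zero the rule simply returns the most accurate model.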
Reasons to prefer a simpler model
….
There are many others
System complexity
Maintenance
Explainability
….
Figure 3: GoogLeNet network with all the bells and whistles
A real-life example
Goal: Supervised
Classification
→ 40 features
→ 10k examples
What did the ML
Engineer choose?
→ Multi-layer ANN trained
with TensorFlow
What was his proposed
next step?
→ Try ConvNets
Where is the problem?
→ Hours to train, already
looking into distributing
→ There are much simpler
approaches
.... But,
sometimes
you do need
a Complex
Model
Lesson 3
Better models and features that “don’t work”
E.g. You have a linear model and have been
selecting and optimizing features for that model
→ More complex model with the same features -> improvement
not likely
→ More expressive features -> improvement not likely
More complex features may require a more
complex model
A more complex model may not show
improvements with a feature set that is too
simple
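The coupling between feature set and model class can be seen on a toy XOR problem. This example is my own illustration, not from the talk: a linear model (here a plain perceptron) cannot use the raw inputs at all, so the same data calls either for a more complex model or for a feature that encodes the interaction.

```python
def train_perceptron(rows, labels, epochs=1000):
    """Classic perceptron with labels in {-1, +1}; the first weight is the bias."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(rows, labels):
            xb = [1.0] + list(x)
            if y * sum(wi * xi for wi, xi in zip(w, xb)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:   # converged: every training point is classified correctly
            break
    return w

def accuracy(w, rows, labels):
    """Training accuracy of the linear decision rule sign(w . x)."""
    hits = 0
    for x, y in zip(rows, labels):
        score = sum(wi * xi for wi, xi in zip(w, [1.0] + list(x)))
        hits += (1 if score > 0 else -1) == y
    return hits / len(rows)

raw = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]                               # XOR of the two inputs
with_interaction = [(a, b, a * b) for a, b in raw]    # adds the product feature

acc_raw = accuracy(train_perceptron(raw, labels), raw, labels)
acc_eng = accuracy(train_perceptron(with_interaction, labels), with_interaction, labels)
print(f"raw features: {acc_raw:.2f}, with interaction feature: {acc_eng:.2f}")
```

With the raw inputs no linear boundary exists, so accuracy cannot exceed 3 out of 4; with the product feature the same linear model separates the data.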
Yes, you
should care
about
Feature
Engineering
Lesson 4
Feature Engineering Example - Answer Ranking
How are those dimensions
translated into features?
Features that relate to the answer
quality itself
Interaction features (upvotes/downvotes,
clicks, comments…)
User features (e.g. expertise in topic)
What is a good Quora answer?
Truthful
Reusable
Provides explanation
Well formatted
...
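The three feature families on this slide (answer quality, interactions, user) could be turned into a feature vector roughly as follows. This is a sketch for illustration only: the `Answer` fields and every feature definition are hypothetical and are not Quora's actual features.

```python
import math
from dataclasses import dataclass

@dataclass
class Answer:
    # Every field here is a hypothetical illustration.
    text: str
    upvotes: int
    downvotes: int
    views: int
    author_topic_answers: int   # answers the author has written in this topic

def answer_features(a: Answer) -> dict:
    """Map one answer to a flat dict of numeric features."""
    votes = a.upvotes + a.downvotes
    return {
        # Features that relate to the answer quality itself
        "log_length": math.log1p(len(a.text.split())),
        "has_paragraphs": float("\n\n" in a.text),
        # Interaction features, smoothed so answers with few votes are not extreme
        "upvote_ratio": (a.upvotes + 1) / (votes + 2),
        "votes_per_view": votes / max(a.views, 1),
        # User features
        "author_expertise": math.log1p(a.author_topic_answers),
    }

example = Answer("one two three", upvotes=3, downvotes=1, views=100, author_topic_answers=0)
features = answer_features(example)
print(features)
```

Log transforms and smoothed ratios keep the features bounded and comparable across answers of very different popularity.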
Feature Engineering
Properties of a well-behaved ML feature
Interpretable
Reliable
Reusable
Transformable
Deep Learning: Automating Feature Discovery
[Figure (I. Goodfellow): four pipelines from input to output. Rule-based systems: hand-designed program. Classic machine learning: hand-designed features, then mapping from features. Representation learning: features, then mapping from features. Deep learning: simplest features, then most complex features, then mapping from features.]
Deep Learning & Feature Engineering
Deep Learning & Feature Architecture Engineering
Supervised
vs.
Unsupervised
Learning
Lesson 5
Supervised/Unsupervised Learning
Unsupervised learning as
dimensionality reduction
E.g.1
Clustering + kNN
E.g.2
Matrix Factorization
MF can be
interpreted as
Unsupervised:
• Dimensionality Reduction a la PCA
• Clustering (e.g. NMF)
Supervised:
• Labeled targets ~ regression
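Both readings of matrix factorization show up in a few lines of code. The sketch below assumes a tiny made-up ratings list, plain SGD and a squared-error loss, with hypothetical names throughout. The fit is supervised (regression on observed ratings), while the learned rows of `P` and `Q` are low-dimensional representations of users and items.

```python
import math
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    """Root mean squared error over the observed ratings."""
    return math.sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings))

def factorize(ratings, n_users, n_items, rank=2, epochs=1000, lr=0.01, reg=0.02, seed=0):
    """Fit user factors P and item factors Q by SGD on observed (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

# Made-up ratings: (user, item, rating on a 1-5 scale).
ratings = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 3, 1),
           (2, 0, 1), (2, 1, 1), (2, 3, 5), (3, 0, 1), (3, 3, 4),
           (4, 1, 1), (4, 2, 5), (4, 3, 4)]

P0, Q0 = factorize(ratings, n_users=5, n_items=4, epochs=0)   # untrained baseline
P, Q = factorize(ratings, n_users=5, n_items=4)
rmse_before, rmse_after = rmse(P0, Q0, ratings), rmse(P, Q, ratings)
print(f"training RMSE: {rmse_before:.2f} -> {rmse_after:.2f}")
```

The two-dimensional user vectors in `P` can then be clustered or passed to another model as features, which is the "unsupervised learning as feature engineering" use.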
Unsupervised learning
as feature engineering
The “magic” behind combining unsupervised/supervised learning
Supervised/Unsupervised Learning
One of the “tricks” in
Deep Learning is how it
combines unsupervised
/supervised learning
→ E.g. Stacked Autoencoders
→ E.g. training of convolutional nets
[Figure: Stacked Autoencoders. Input (x1 to x6) → Features I → Features II → Softmax classifier → P(y=0 | x), P(y=1 | x), P(y=2 | x)]
[Figure: Convolutional Network (ConvNet). Input 83x83 → 9x9 convolution (64 kernels) → Layer 1 64x75x75 → 10x10 pooling, 5x5 subsampling → Layer 2 64@14x14 → 9x9 convolution (4096 kernels) → Layer 3 256@6x6 → 6x6 pooling, 4x4 subsampling → Layer 4 256@1x1 → Output 101]
→ Non-Linearity: half-wave rectification, shrinkage function, sigmoid
→ Pooling: average, L1, L2, max
→ Training: Supervised (1988-2006), Unsupervised+supervised (2006-now)
[Figure: map of learning methods with regions labeled Neural Networks, Supervised and Unsupervised: Boosting, SVM, Decision Tree, Perceptron, AE, D-AE, Neural Net, RNN, Conv. Net, RBM, Sparse Coding, DBN, DBM, GMM, Bayes NP, ΣΠ]
Supervised/Unsupervised Self-supervised Learning
Self-supervision
→ E.g. BERT and other LM
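In self-supervision the labels come from the data itself. The sketch below builds one masked-language-model training pair, as a simplified illustration only: real BERT pre-training picks positions at random and sometimes keeps or swaps the chosen token, which is omitted here, and `mask_tokens` is a hypothetical helper.

```python
def mask_tokens(tokens, positions, mask_token="[MASK]"):
    """Build one self-supervised example: the masked input plus the tokens to recover."""
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]   # the target is the original token
        masked[pos] = mask_token    # the model only sees the mask
    return masked, labels

tokens = "the patient reports a mild headache".split()
masked, labels = mask_tokens(tokens, positions=[1, 5])
print(masked)
print(labels)
```

No human annotation is involved: any unlabeled text yields as many (input, label) pairs as desired.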
Everything is
an ensemble
Lesson 6
Ensembles
Netflix Prize was won by an ensemble
Most practical applications of ML run
an ensemble
→ Initially BellKor was using GBDTs
→ BigChaos introduced ANN-based ensemble
→ Why wouldn’t you?
→ At least as good as the best of your methods
→ Can add completely different approaches (e.g. CF
and content-based)
→ You can use many different models at the ensemble
layer: LR, GBDTs, RFs, ANNs...
Ensembles & Feature Engineering
Ensembles are
the way to turn
any model into a
feature!
E.g. Don’t know if the
way to go is to use
Factorization Machines,
Tensor Factorization, or
RNNs?
→ Treat each model as a
“feature”
→ Feed them into an
ensemble
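Treating each model as a feature can be sketched with the simplest possible ensemble layer: a least-squares linear blend of two models' predictions. The prediction lists below are made up, and `blend_weights` is a hypothetical helper that solves the 2x2 normal equations directly.

```python
def blend_weights(pred_a, pred_b, target):
    """Least-squares weights (w_a, w_b) for target ~ w_a * pred_a + w_b * pred_b."""
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    sat = sum(a * t for a, t in zip(pred_a, target))
    sbt = sum(b * t for b, t in zip(pred_b, target))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("model outputs are collinear; the blend is not unique")
    return (sat * sbb - sbt * sab) / det, (sbt * saa - sat * sab) / det

def mse(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Hypothetical held-out predictions from two different models.
target = [4.0, 3.0, 5.0, 1.0, 2.0]
model_a = [3.5, 3.4, 4.2, 1.8, 2.5]
model_b = [4.6, 2.1, 5.5, 0.5, 1.2]

w_a, w_b = blend_weights(model_a, model_b, target)
blended = [w_a * a + w_b * b for a, b in zip(model_a, model_b)]
print(f"A: {mse(model_a, target):.3f}  B: {mse(model_b, target):.3f}  "
      f"blend: {mse(blended, target):.3f}")
```

On the data used to fit the weights the blend can never be worse than either input, because using one model alone is just the weight pair (1, 0) or (0, 1). To avoid rewarding overfit models, the weights should be fit on predictions for examples the base models did not train on.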
[Figure: Wide Models, Deep Models, and Wide & Deep Models. Labels: Sigmoid, Rectified Linear Units, Output Units, Hidden Layers, Dense Embeddings, Sparse Features]
There are
biases in your
data
Lesson 7
Defining training/testing data
Training a simple binary classifier for
good/bad answer
→ Defining positive and negative labels ->
Non-trivial task
→ Is this a positive or a negative?
→ funny uninformative answer with many
upvotes
→ short uninformative answer by a
well-known expert in the field
→ very long informative answer that nobody
reads/upvotes
→ informative answer with
grammar/spelling mistakes
→ ...
The curse of presentation bias
Better options
→ Correcting for the probability
a user will click on a position
-> Attention models
→ Explore/exploit approaches
such as MAB
Simply treating things you
show as negatives is not likely
to work
User can only click on what
you decide to show
→ But, what you decide to
show is the result of what
your model predicted is good
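One common way to correct for position, named here as an assumption rather than as the method used in the talk, is inverse propensity weighting: each click is weighted by one over the probability that the user examined that position. The examination probabilities and the impression log below are made up.

```python
def ipw_click_rate(impressions, exam_prob):
    """Position-debiased click rate per item.

    impressions: list of (item, position, clicked)
    exam_prob:   position -> assumed probability that a user looks at that slot
    """
    weighted_clicks, shown = {}, {}
    for item, pos, clicked in impressions:
        shown[item] = shown.get(item, 0) + 1
        if clicked:
            weighted_clicks[item] = weighted_clicks.get(item, 0.0) + 1.0 / exam_prob[pos]
    return {item: weighted_clicks.get(item, 0.0) / n for item, n in shown.items()}

exam_prob = {1: 1.0, 2: 0.5, 3: 0.25}     # hypothetical; must be estimated in practice
log = (
    [("A", 1, True)] * 2 + [("A", 1, False)] * 2 +   # A: top slot, raw CTR 0.50
    [("B", 3, True)] * 1 + [("B", 3, False)] * 3     # B: third slot, raw CTR 0.25
)
debiased = ipw_click_rate(log, exam_prob)
print(debiased)
```

Item B looks worse on raw click-through rate only because it was shown where few users look; after weighting it comes out ahead. The estimate is only as good as the examination probabilities, which usually have to be learned from randomized or explore/exploit traffic.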
[Figure: positions ranging from "more likely to see" to "less likely"]
Bias & Fairness
Think about your
models
“in the wild”
Lesson 8
AI in the wild: Desired properties
● Easily extensible
○ Incrementally/iteratively learn from
“human-in-the-loop” or from
additional data
● Knows what it does not know
○ Models uncertainty in prediction
○ Enables fall-back to manual
Assisted diagnosis in the wild
1. Extensibility
a. Diagnosis as a ML task
i. Expert systems as a prior
b. Modeling less prevalent diseases
i. Low-shot learning
2. Knowing what you don’t know
b. Measures of uncertainty in
prediction
c. Allows fall-back to
“physician-in-the-loop”
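A fall-back rule of this kind can be as simple as deferring whenever the predictive distribution is too flat. The sketch below uses normalized entropy with an arbitrary threshold. The labels, probabilities and the `diagnose_or_defer` name are hypothetical, and it assumes the model's probabilities are reasonably calibrated.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def diagnose_or_defer(probs, max_entropy_fraction=0.5):
    """Return the top label, or None to hand the case to a physician when uncertain.

    probs: dict mapping label -> predicted probability (assumed to sum to 1).
    """
    labels = list(probs)
    h = entropy(probs.values())
    h_max = math.log(len(labels))          # entropy of the uniform distribution
    if h > max_entropy_fraction * h_max:
        return None
    return max(labels, key=probs.get)

confident = {"flu": 0.95, "cold": 0.04, "strep": 0.01}
uncertain = {"flu": 0.40, "cold": 0.35, "strep": 0.25}
print(diagnose_or_defer(confident), diagnose_or_defer(uncertain))
```

The threshold trades automation rate against risk and would be tuned on held-out cases, for example by targeting a maximum error rate among the cases the model keeps.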
Data and Models are great.
You know what’s even better?
The right
evaluation
approach!
Lesson 9
Offline/Online testing process
[Figure: flowchart in two panels. Offline Experimentation: Initial Hypothesis, Choose Model, Train Model, Test Offline, Hypothesis Validated? (YES/NO), Try different Model? (YES/NO), Reformulated Hypothesis. Online Experimentation: Design AB Test, Choose Control, Deploy Prototype, Observe Behavior, Analyze Results, Significant Improvements? (YES/NO), Deploy Feature]
Executing A/B tests
Overall Evaluation Criteria (OEC) =
e.g. member retention at Netflix
→ Use long-term metrics
whenever possible
→ Short-term metrics can be
informative and allow faster
decisions
− But, not always aligned with
OEC
Measure differences
in metrics across
statistically identical
populations that
each experience a
different algorithm.
Decisions on the product always
data-driven
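For a binary metric such as retention, the difference between two populations can be checked with a two-proportion z-test. This is a minimal sketch with made-up counts. It assumes independent users and samples large enough for the normal approximation, and the function name is hypothetical.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference in rates between cells B and A."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical experiment: 1000 users per cell, 100 vs. 130 retained.
z, p_value = two_proportion_z_test(success_a=100, n_a=1000, success_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Here the lift from 10% to 13% retention gives a z of roughly 2.1 and a p-value of roughly 0.035; whether that is enough to ship depends on the significance level fixed before the test started.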
Offline testing
Measure model
performance, using (IR)
metrics
Offline performance =
indication to make decisions
on follow-up A/B tests
A critical (and mostly
unsolved) issue is how
offline metrics correlate with
A/B test results.
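As an example of an IR metric for offline testing, NDCG@k scores a ranked list by discounting relevance with position and normalizing by the best possible ordering. The choice of NDCG and the graded relevance values are assumptions for illustration; the talk does not say which metrics were used.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k results (rank 0 is the top slot)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Graded relevance of the items in the order the model ranked them (made up).
print(ndcg_at_k([3, 2, 1, 0], k=4))   # already ideally ordered
print(ndcg_at_k([0, 1, 2, 3], k=4))   # worst ordering of the same items
```

A model that improves NDCG offline is a candidate for an A/B test, not a guaranteed win, which is the correlation problem the slide points out.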
Do not
underestimate
the value of
systems and
frameworks
Lesson 10
ML vs Software
Can you treat your ML infrastructure as you
would your software one?
→ Yes and No
You should apply best Software Engineering
practices (e.g. encapsulation, abstraction,
cohesion, low coupling…)
However, Design Patterns for Machine Learning
software are not well known/documented
Software: the new frontier of ML?
Your AI
infrastructure
will have two
masters
Lesson 11
Machine Learning Infrastructure
→ Whenever you develop any ML infrastructure, you need to target two different modes:
Mode 1: ML experimentation
− Flexibility
− Easy-to-use
− Reusability
Mode 2: ML production
− All of the above + performance & scalability
→ Ideally you want the two modes to be as similar as possible
→ How to combine them?
Machine Learning Infrastructure
Option 1
→ Favor experimentation and only invest in productionizing once something shows results
→ E.g. Have ML researchers use R and then ask Engineers to implement things in production when they work
Option 2
→ Favor production and have “researchers” struggle to figure out how to run experiments
→ E.g. Implement highly optimized C++ code and have ML researchers experiment only through data available in logs/DB
Machine Learning Infrastructure
Good
intermediate
options
→ Have ML “researchers” experiment on Jupyter Notebooks using
Python tools (scikit-learn, Pytorch, TF…). Use same tools in
production whenever possible, implement optimized versions only
when needed.
→ Implement abstraction layers on top of optimized implementations
so they can be accessed from regular/friendly experimentation tools
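The abstraction-layer idea can be sketched as a small shared interface that both the notebook-friendly and the optimized implementation satisfy. All class names are hypothetical, and the "optimized" class is only a stand-in that a real system would back with native code.

```python
from typing import Protocol, Sequence

class Scorer(Protocol):
    """Interface shared by experimentation and production code."""
    def score(self, features: Sequence[float]) -> float: ...

class PythonLinearScorer:
    """Readable reference implementation, convenient in notebooks."""
    def __init__(self, weights, bias=0.0):
        self.weights, self.bias = list(weights), bias

    def score(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

class OptimizedLinearScorer:
    """Stand-in for a wrapper around an optimized (e.g. C++) backend."""
    def __init__(self, weights, bias=0.0):
        # A real system would hand the weights to native code here.
        self._impl = PythonLinearScorer(weights, bias)

    def score(self, features):
        return self._impl.score(features)

def rank(scorer: Scorer, candidates):
    """Pipeline code depends only on the interface, not on the backend."""
    return sorted(candidates, key=scorer.score, reverse=True)

candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.2, 0.5]
notebook_order = rank(PythonLinearScorer(weights), candidates)
production_order = rank(OptimizedLinearScorer(weights), candidates)
print(notebook_order == production_order)
```

Because `rank` is written against `Scorer`, an experiment validated in a notebook runs the same pipeline code in production, and only the backend behind the interface changes.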
There is ML
beyond Deep
Learning
Lesson 12
Other ML Advances
● Factorization Machines
● Tensor Methods
● Non-parametric Bayesian models
● XGBoost
● Online Learning
● Reinforcement Learning
● Learning to rank
● ...
Other very successful approaches
Sometimes DL does not win
Conclusions
01. Choose the right metric
02. Be thoughtful about your data
03. Understand dependencies between data, models & systems
04. Optimize only what matters, beware of biases
05. Be thoughtful about your ML infrastructure/tools and about organizing your teams

From one to zero: Going smaller as a growth strategy
 
Learning to speak medicine
Learning to speak medicineLearning to speak medicine
Learning to speak medicine
 
ML to cure the world
ML to cure the worldML to cure the world
ML to cure the world
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
 
Medical advice as a Recommender System
Medical advice as a Recommender SystemMedical advice as a Recommender System
Medical advice as a Recommender System
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry Perspective
 
Machine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora ExampleMachine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora Example
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
Barcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons LearnedBarcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons Learned
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
 
Machine Learning to Grow the World's Knowledge
Machine Learning to Grow  the World's KnowledgeMachine Learning to Grow  the World's Knowledge
Machine Learning to Grow the World's Knowledge
 
MLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@QuoraMLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@Quora
 
Lean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven CompaniesLean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven Companies
 

Recently uploaded

Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 

Recently uploaded (20)

Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 

Lessons learned from building practical deep learning systems

  • 1. LESSONS LEARNED from building practical Deep Learning systems Xavier Amatriain (@xamat)
  • 2. A bit about myself...
  • 3. A bit about myself... ● PhD in Audio and Music Signal Processing and Modeling ● Researcher in Recommender Systems for several years ● Led ML Research/Engineering at Netflix ● VP of Engineering at Quora ● Currently co-founder/CTO at Curai (Providing the world’s best healthcare to everyone)
  • 4. A bit about Curai...
  • 5. What are we doing? ● Mission: Provide the world's best healthcare for everyone ● Product: User-facing mobile primary care app ● Team: Building an awesome and diverse team ● Approach: State-of-the-art AI/ML + product/UX/clinical [Figure labels: AI-based interaction; AI + Health coaches; AI + Doctors]
  • 9. More data or better models? Really?
  • 10. More data or better models? Sometimes, it’s not about more data
  • 11. More data or better models? Norvig: “Google does not have better Algorithms, only more Data”. Many features / low-bias models
  • 12. More data or better models? Sometimes you might not need all your “Big Data” [Plot: testing accuracy vs. number of training examples (in millions, 0 to 20)]
  • 13. What about Deep Learning?
      Year | Breakthrough in AI | Dataset (first available) | Algorithm (first proposed)
      1994 | Human-level spontaneous speech recognition | Spoken Wall Street Journal articles and other texts (1991) | Hidden Markov Model (1984)
      1997 | IBM Deep Blue defeated Garry Kasparov | 700,000 Grandmaster chess games, aka “The Extended Book” (1991) | Negascout planning algorithm (1983)
      2005 | Google’s Arabic- and Chinese-to-English translation | 1.8 trillion tokens from Google Web and News pages (collected in 2005) | Statistical machine translation algorithm (1988)
      2011 | IBM Watson became the world Jeopardy! champion | 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2005) | Mixture-of-Experts algorithm (1991)
      2014 | Google’s GoogLeNet object classification at near-human performance | ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010) | Convolutional neural network algorithm (1989)
      2015 | Google’s DeepMind achieved human parity in playing 29 Atari games by learning general control from video | Arcade Learning Environment dataset of over 50 Atari games (2013) | Q-learning algorithm (1992)
      Average no. of years to breakthrough | | 3 years | 18 years
      The average elapsed time between key algorithm proposals and corresponding advances was about 18 years, whereas the average elapsed time between key dataset availabilities and corresponding advances was less than 3 years, or about 6 times faster.
  • 14. What about Deep Learning? Models and Recipes. Pretrained: available models trained using OpenNMT → English to German → German to English → English Summarization → Multi-way: FR,ES,PT,IT,RO <> FR,ES,PT,IT,RO. More models coming soon: → Ubuntu Dialog Dataset → Syntactic Parsing → Image-to-Text
  • 15. More data and better data
  • 18. Occam’s razor: Given two models that perform more or less equally, you should always prefer the less complex one. Deep Learning might not be preferred, even if it squeezes out +1% in accuracy.
  • 19. Reasons to prefer a simpler model
  • 20. Reasons to prefer a simpler model: there are many others, e.g. system complexity, maintenance, explainability, ... [Figure 3: GoogLeNet network with all the bells and whistles]
  • 21. A real-life example. Goal: Supervised Classification → 40 features → 10k examples. What did the ML Engineer choose? → Multi-layer ANN trained with TensorFlow. What was his proposed next step? → Try ConvNets. Where is the problem? → Hours to train, already looking into distributing → There are much simpler approaches
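To make "much simpler approaches" concrete, here is a minimal sketch of a logistic regression baseline on a problem of the size the slide mentions (40 features, 10k examples). The slide does not say which simpler method was meant, so logistic regression is an assumption, and the synthetic data generator and all function names are hypothetical stand-ins for the real dataset and code.

```python
import math
import random


def make_synthetic_data(n_examples, n_features, seed=0):
    """Noisy, roughly linearly separable binary data (hypothetical stand-in)."""
    rng = random.Random(seed)
    true_w = [rng.gauss(0, 1) for _ in range(n_features)]
    X, y = [], []
    for _ in range(n_examples):
        x = [rng.gauss(0, 1) for _ in range(n_features)]
        score = sum(w * v for w, v in zip(true_w, x)) + rng.gauss(0, 0.5)
        X.append(x)
        y.append(1 if score > 0 else 0)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, epochs=3, lr=0.05):
    """Fit weights and bias with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - target  # derivative of the log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def accuracy(w, b, X, y):
    """Fraction of examples whose predicted class matches the label."""
    hits = 0
    for x, target in zip(X, y):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        hits += int((p >= 0.5) == (target == 1))
    return hits / len(y)


def run_baseline(n_examples=10_000, n_features=40):
    """Train on the first 80% of the data and return held-out accuracy."""
    X, y = make_synthetic_data(n_examples, n_features)
    split = int(0.8 * n_examples)
    w, b = train_logistic_regression(X[:split], y[:split])
    return accuracy(w, b, X[split:], y[split:])
```

Calling `run_baseline()` trains in seconds on a laptop even in pure Python, which is the point of the slide: a baseline like this is worth running before reaching for a multi-layer network and distributed training.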
  • 22. .... But, sometimes you do need a Complex Model Lesson 3
  • 23. Better models and features that “don’t work”. E.g. You have a linear model and have been selecting and optimizing features for that model → More complex model with the same features -> improvement not likely → More expressive features -> improvement not likely. More complex features may require a more complex model. A more complex model may not show improvements with a feature set that is too simple.
  • 25. Feature Engineering Example - Answer Ranking. What is a good Quora answer? Truthful, reusable, provides explanation, well formatted, ... How are those dimensions translated into features? → Features that relate to the answer quality itself → Interaction features (upvotes/downvotes, clicks, comments…) → User features (e.g. expertise in topic)
  • 26. Feature Engineering. Properties of a well-behaved ML feature: Interpretable, Reliable, Reusable, Transformable. Deep Learning: Automating Feature Discovery. [Figure (I. Goodfellow): input-to-output pipelines for rule-based systems (hand-designed program), classic machine learning (hand-designed features, mapping from features), representation learning (features, mapping from features), and deep learning (simplest features, most complex features, mapping from features)]
  • 27. Deep Learning & Feature Engineering
  • 28. Deep Learning & Feature Architecture Engineering
  • 30. Supervised/Unsupervised Learning. Unsupervised learning as dimensionality reduction. Unsupervised learning as feature engineering. The “magic” behind combining unsupervised/supervised learning → E.g.1 Clustering + kNN → E.g.2 Matrix Factorization. MF can be interpreted as: Unsupervised: • Dimensionality reduction à la PCA • Clustering (e.g. NMF). Supervised: • Labeled targets ~ regression
  • 31. Supervised/Unsupervised Learning. One of the “tricks” in Deep Learning is how it combines unsupervised/supervised learning → E.g. Stacked Autoencoders → E.g. training of convolutional nets. [Figure: Stacked Autoencoders: inputs X1 to X6, Features I, Features II, softmax classifier giving P(y=0 | x), P(y=1 | x), P(y=2 | x)] [Figure: Convolutional Network (ConvNet): Input 83x83, Layer 1 64x75x75, Layer 2 64@14x14, Layer 3 256@6x6, Layer 4 256@1x1, Output 101; operations in order: 9x9 convolution (64 kernels), 10x10 pooling, 5x5 subsampling, 9x9 convolution (4096 kernels), 6x6 pooling, 4x4 subsampling → Non-linearity: half-wave rectification, shrinkage function, sigmoid → Pooling: average, L1, L2, max → Training: Supervised (1988-2006), Unsupervised+supervised (2006-now)] [Figure: map of supervised and unsupervised methods: Boosting, SVM, Decision Tree, Perceptron, AE, D-AE, Neural Net, RNN, Conv. Net, RBM, Sparse Coding, DBN, DBM, GMM, Bayes NP, ΣΠ]
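The "Clustering + kNN" example can be sketched as follows: cluster unlabeled points, describe every point by its distance to each centroid, and run a nearest-neighbour classifier in that new feature space. This is only one reading of the slide; the function names, the deterministic k-means initialization, and the toy data are assumptions made for illustration.

```python
import math


def kmeans(points, k, iters=10):
    """Plain k-means with a deterministic start (evenly spaced points)."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return centroids


def cluster_features(point, centroids):
    """Unsupervised step as feature engineering: distances to centroids."""
    return [math.dist(point, c) for c in centroids]


def knn_predict(query, labeled, centroids):
    """Supervised step: 1-nearest-neighbour in the cluster-feature space."""
    q = cluster_features(query, centroids)
    _, label = min(
        (math.dist(q, cluster_features(x, centroids)), lbl)
        for x, lbl in labeled
    )
    return label
```

The unsupervised step only needs unlabeled points, so it can use far more data than the small labeled set the classifier sees.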
  • 34. Ensembles. Netflix Prize was won by an ensemble → Initially Bellkor was using GBDTs → BigChaos introduced ANN-based ensemble. Most practical applications of ML run an ensemble → Why wouldn’t you? → At least as good as the best of your methods → Can add completely different approaches (e.g. CF and content-based) → You can use many different models at the ensemble layer: LR, GBDTs, RFs, ANNs...
  • 35. Ensembles & Feature Engineering. Ensembles are the way to turn any model into a feature! E.g. Don’t know if the way to go is to use Factorization Machines, Tensor Factorization, or RNNs? → Treat each model as a “feature” → Feed them into an ensemble [Figure: Wide, Deep, and Wide & Deep models: sparse features, dense embeddings, hidden layers (rectified linear units), output units (sigmoid)]
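A minimal sketch of "treat each model as a feature": the outputs of two base models become the inputs of a logistic regression ensemble layer (stacking). The base models here are hypothetical toys that each see only one input; in practice they would be the scores of, say, a factorization machine and an RNN. Training and evaluating on the same toy data is a simplification; a real setup would fit the ensemble layer on held-out predictions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical base models: each one captures only part of the signal.
def model_a(x):
    return x[0]


def model_b(x):
    return x[1]


def fit_ensemble(model_outputs, labels, iters=2000, lr=1.0):
    """Logistic regression over base-model outputs, batch gradient descent."""
    n_models = len(model_outputs[0])
    n = len(labels)
    w = [0.0] * n_models
    b = 0.0
    for _ in range(iters):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for outputs, y in zip(model_outputs, labels):
            err = sigmoid(sum(wi * o for wi, o in zip(w, outputs)) + b) - y
            for j, o in enumerate(outputs):
                grad_w[j] += err * o
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def ensemble_predict(w, b, outputs):
    return sigmoid(sum(wi * o for wi, o in zip(w, outputs)) + b)


def accuracy(scores, labels):
    """Threshold scores at 0.5 and compare with the 0/1 labels."""
    hits = sum((s >= 0.5) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)


def demo():
    # Toy data: the label depends on both inputs jointly.
    data = [((i / 10, j / 10), 1 if i + j > 10 else 0)
            for i in range(11) for j in range(11)]
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    outputs = [(model_a(x), model_b(x)) for x in xs]
    w, b = fit_ensemble(outputs, ys)
    return {
        "model_a": accuracy([o[0] for o in outputs], ys),
        "model_b": accuracy([o[1] for o in outputs], ys),
        "ensemble": accuracy([ensemble_predict(w, b, o) for o in outputs], ys),
    }
```

`demo()` returns the accuracy of each base model alone and of the ensemble, so the effect of the ensemble layer on this toy problem can be read off directly.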
  • 36. There are biases in your data Lesson 7
  • 37. Defining training/testing data Training a simple binary classifier for good/bad answer → Defining positive and negative labels -> Non-trivial task → Is this a positive or a negative? → funny uninformative answer with many upvotes → short uninformative answer by a well-known expert in the field → very long informative answer that nobody reads/upvotes → informative answer with grammar/spelling mistakes → ...
  • 38. The curse of presentation bias. User can only click on what you decide to show → But, what you decide to show is the result of what your model predicted is good. Simply treating things you show as negatives is not likely to work. Better options → Correcting for the probability a user will click on a position -> Attention models → Explore/exploit approaches such as MAB [Figure: positions a user is more likely vs. less likely to see]
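One way to read "correcting for the probability a user will click on a position" is a position-based click model, where P(click) = P(examined | position) × relevance(item). The sketch below divides clicks by expected examinations instead of raw impressions. The slide does not specify this model, and the examination probabilities are made-up values that would have to be estimated in practice, for example from a randomized experiment.

```python
from collections import defaultdict

# Hypothetical examination probabilities per display position
# (assumed known here; not taken from the slides).
EXAMINATION_PROB = {0: 1.0, 1: 0.5, 2: 0.25}


def naive_ctr(log):
    """Clicks divided by impressions, ignoring where the item was shown."""
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for item, _position, clicked in log:
        shown[item] += 1
        clicks[item] += clicked
    return {item: clicks[item] / shown[item] for item in shown}


def debiased_ctr(log, exam_prob=EXAMINATION_PROB):
    """Clicks divided by expected examinations under the position model."""
    exposure = defaultdict(float)
    clicks = defaultdict(int)
    for item, position, clicked in log:
        exposure[item] += exam_prob[position]
        clicks[item] += clicked
    return {item: clicks[item] / exposure[item] for item in exposure}
```

With a log where item "A" is always shown at position 0 and item "B" always at position 2, the naive estimate penalizes "B" for being rarely seen, while the corrected estimate credits it for the clicks it earns when it is examined.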
  • 40. Think about your models “in the wild” Lesson 8
  • 41. AI in the wild: Desired properties ● Easily extensible ○ Incrementally/iteratively learn from “human-in-the-loop” or from additional data ● Knows what it does not know ○ Models uncertainty in prediction ○ Enables fall-back to manual
  • 42. Assisted diagnosis in the wild 1. Extensibility a. Diagnosis as a ML task i. Expert systems as a prior b. Modeling less prevalent diseases i. Low-shot learning 2. Knowing what you don’t know b. Measures of uncertainty in prediction c. Allows fall-back to “physician-in-the-loop”
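A minimal sketch of "knows what it does not know" with a fall-back to a human: the system returns a prediction only when its top class probability clears a threshold, and otherwise defers. The function name, the threshold, and the use of the top probability as the uncertainty measure are assumptions; the slides do not say which measure is used, and raw model probabilities are often poorly calibrated, so a real system would likely calibrate them first.

```python
def predict_or_defer(class_probs, min_confidence=0.8):
    """Return (label, confidence), or (None, confidence) to request review.

    class_probs maps each candidate label to the model's probability.
    A None label signals the fall-back to a human (e.g. a physician).
    """
    label, confidence = max(class_probs.items(), key=lambda kv: kv[1])
    if confidence < min_confidence:
        return None, confidence
    return label, confidence


def triage(cases, min_confidence=0.8):
    """Split cases into automated answers and those routed to a human."""
    automated, deferred = [], []
    for case_id, probs in cases:
        label, _ = predict_or_defer(probs, min_confidence)
        if label is None:
            deferred.append(case_id)
        else:
            automated.append((case_id, label))
    return automated, deferred
```

Raising `min_confidence` sends more cases to the human reviewer and fewer to the model, which makes the trade-off between automation and safety an explicit, tunable parameter.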
  • 43. Data and Models are great. You know what’s even better? The right evaluation approach! Lesson 9
  • 44. Offline/Online testing process [Flowchart: Offline Experimentation: Initial Hypothesis → Choose Model → Train Model → Test Offline → Hypothesis Validated? (if NO: Try different Model? / Reformulated Hypothesis). Online Experimentation: Design AB Test → Choose Control → Deploy Prototype → Observe Behavior → Analyze Results → Significant Improvements? (if YES: Deploy Feature)]
  • 45. Executing A/B tests. Measure differences in metrics across statistically identical populations that each experience a different algorithm. Overall Evaluation Criteria (OEC) = e.g. member retention at Netflix → Use long-term metrics whenever possible → Short-term metrics can be informative and allow faster decisions, but are not always aligned with OEC. Decisions on the product are always data-driven.
  • 46. Offline testing. Measure model performance using (IR) metrics. Offline performance = indication to make decisions on follow-up A/B tests. A critical (and mostly unsolved) issue is how offline metrics correlate with A/B test results.
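Measuring a difference "across statistically identical populations" usually ends in a significance test. The sketch below uses a two-sided two-proportion z-test on a binary metric such as retained / not retained. The slides do not name a specific test, so this choice and the function name are assumptions; it also assumes both groups have at least one success and one failure overall, otherwise the standard error is zero.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for rate_b minus rate_a."""
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups are identical.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

For example, `two_proportion_z_test(1000, 10_000, 1100, 10_000)` compares a 10% control rate with an 11% treatment rate; a small p-value suggests the difference is unlikely to be noise, though it says nothing about whether the short-term metric tracks the OEC.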
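As an example of an offline IR metric, here is a sketch of NDCG@k computed from graded relevance labels listed in the order the model ranked them. NDCG is chosen for illustration only; the slides just say "(IR) metrics". This version uses the linear-gain form of DCG (relevance divided by a log2 position discount).

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the top-k results, in ranked order."""
    return sum(
        rel / math.log2(rank + 2)  # rank 0 gets discount log2(2) = 1
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items at all
    return dcg_at_k(relevances, k) / ideal
```

A ranking that places the most relevant items first scores 1.0; pushing them down the list lowers the score. Whether a gain in such a metric predicts a win in the A/B test is exactly the open issue the slide raises.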
  • 47. Do not underestimate the value of systems and frameworks Lesson 10
  • 48. ML vs Software. Can you treat your ML infrastructure as you would your software one? → Yes and No. You should apply best Software Engineering practices (e.g. encapsulation, abstraction, cohesion, low coupling…). However, Design Patterns for Machine Learning software are not well known/documented.
  • 49. Software: the new frontier of ML?
  • 50. Your AI infrastructure will have two masters Lesson 11
  • 51. Machine Learning Infrastructure → Whenever you develop any ML infrastructure, you need to target two different modes: Mode 1: ML experimentation − Flexibility − Easy-to-use − Reusability Mode 2: ML production − All of the above + performance & scalability → Ideally you want the two modes to be as similar as possible → How to combine them?
  • 52. Machine Learning Infrastructure. Option 1: → Favor experimentation and only invest in productionizing once something shows results → E.g. Have ML researchers use R and then ask Engineers to implement things in production when they work. Option 2: → Favor production and have “researchers” struggle to figure out how to run experiments → E.g. Implement highly optimized C++ code and have ML researchers experiment only through data available in logs/DB
  • 54. Machine Learning Infrastructure. Good intermediate options → Have ML “researchers” experiment on Jupyter Notebooks using Python tools (scikit-learn, PyTorch, TF…). Use the same tools in production whenever possible, and implement optimized versions only when needed. → Implement abstraction layers on top of optimized implementations so they can be accessed from regular/friendly experimentation tools
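The abstraction-layer idea can be sketched as a shared interface with two interchangeable backends: a readable reference implementation for notebooks and an optimized one for production. All class and function names are hypothetical, and the "optimized" class below is only a stand-in that stores weights in a packed array; in a real system it would wrap native code behind bindings.

```python
from abc import ABC, abstractmethod
from array import array


class Scorer(ABC):
    """Interface shared by experimentation and production code."""

    @abstractmethod
    def score(self, features):
        """Return a relevance score for one feature vector."""


class ReferenceLinearScorer(Scorer):
    """Readable implementation, convenient for notebook experiments."""

    def __init__(self, weights, bias=0.0):
        self.weights = list(weights)
        self.bias = bias

    def score(self, features):
        return sum(w * f for w, f in zip(self.weights, features)) + self.bias


class OptimizedLinearScorer(Scorer):
    """Stand-in for an optimized backend; the speed-up is assumed, not shown."""

    def __init__(self, weights, bias=0.0):
        self._weights = array("d", weights)  # packed doubles
        self._bias = float(bias)

    def score(self, features):
        total = self._bias
        for w, f in zip(self._weights, features):
            total += w * f
        return total


def rank(items, scorer):
    """Experiment code depends only on the Scorer interface."""
    return sorted(items, key=lambda it: scorer.score(it["features"]), reverse=True)
```

Because `rank` only sees the `Scorer` interface, the same experiment code can run against either backend, which keeps the experimentation and production modes as similar as possible.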
  • 55. There is ML beyond Deep Learning Lesson 12
  • 56. Other ML Advances ● Factorization Machines ● Tensor Methods ● Non-parametric Bayesian models ● XGBoost ● Online Learning ● Reinforcement Learning ● Learning to rank ● ...
  • 57. Other very successful approaches
  • 58. Sometimes DL does not win
  • 60. 01. Choose the right metric 02. Be thoughtful about your data 03. Understand dependencies between data, models & systems 04. Optimize only what matters, beware of biases 05. Be thoughtful about your ML infrastructure/tools and about organizing your teams
  • 61. LESSONS LEARNED from building practical Deep Learning systems Xavier Amatriain (@xamat)