Our Summer 2017 release presents Deepnets, a highly effective supervised learning method that solves classification and regression problems, matching or exceeding human performance especially in domains where effective feature engineering is difficult. BigML Deepnets bring two unique parameter optimization options, Automatic Network Search and Structure Suggestion, which spare you the difficult and time-consuming work of hand-tuning the algorithm and help find the best-performing network for your problem. This new resource is available from the BigML Dashboard and API, as well as from WhizzML for automation. Deepnets are state-of-the-art in many important supervised learning applications.
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture Review: Summary Day 2 Sessions. By Mercè Martín Prats (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 6: Time Series and Deepnets. By Charles Parker (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
VSSML17 L5. Basic Data Transformations and Feature Engineering (BigML, Inc)
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 5: Basic Data Transformations and Feature Engineering. By Poul Petersen (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
VSSML16 LR1. Summary Day 1
Valencian Summer School in Machine Learning 2016
Day 1
Summary Day 1
Mercè Martin (BigML)
https://bigml.com/events/valencian-summer-school-in-machine-learning-2016
In this presentation, I talk about data science competitions. After an introduction to data science competitions, I go through their benefits, misconceptions, and best practices.
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D... (Sri Ambati)
Presented at #H2OWorld 2017 in Mountain View, CA.
Enjoy the video: https://youtu.be/-rGRHrED94Y.
Learn more about H2O.ai: https://www.h2o.ai/.
Follow @h2oai: https://twitter.com/h2oai.
- - -
Abstract:
Most machine learning systems enable two essential processes: creating a model and applying the model in a repeatable and controlled fashion. These two processes are interrelated and pose technological and organizational challenges as they evolve from research to prototype to production. This presentation outlines common design patterns for tackling such challenges while implementing machine learning in a production environment.
Sergei's Bio:
Dr. Sergei Izrailev is Chief Data Scientist at BeeswaxIO, where he is responsible for data strategy and building AI applications powering the next generation of real-time bidding technology. Before Beeswax, Sergei led data science teams at Integral Ad Science and Collective, where he focused on architecture, development and scaling of data science based advertising technology products. Prior to advertising, Sergei was a quant/trader and developed trading strategies and portfolio optimization methodologies. Previously, he worked as a senior scientist at Johnson & Johnson, where he developed intelligent tools for structure-based drug discovery. Sergei holds a Ph.D. in Physics and Master of Computer Science degrees from the University of Illinois at Urbana-Champaign.
Gender Prediction with Databricks AutoML Pipeline (Databricks)
As the nation’s leading advocate for people aged 50+, each month AARP conducts thousands of campaigns made up of hundreds of millions of emails, mails and phone calls to over 37 million members and a broader universe of non-members. Missing information on demographics results in less accurate profiling and targeting strategies. For example, there are 1.5 million active members and 15 million expired members missing gender information in AARP’s database. The name-gender model is a use case where the AARP Data Analytics team utilized the Databricks Lakehouse platform to create a fully automated machine learning model. The Random Forest classifier used 800 thousand existing distinct first names, ages, and over 700 variables derived from the letter composition of first names to predict gender. It leveraged MLflow to track the accuracy of models and log the metrics over time, and registered multiple model versions to pick the best for production. Model training and scoring were scheduled, and the AutoML pipeline significantly minimized manual working hours after the initial setup. As a result, AARP dramatically (and accurately) improved the coverage of gender information from about 92.5% to 99.5%.
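To make the "letter composition of first names" idea concrete, here is a small feature extractor in plain Python. The specific features below are hypothetical, invented for illustration; they are not AARP's actual 700+ variables:

```python
def name_features(name):
    """Turn a first name into simple letter-composition features
    (hypothetical feature set, for illustration only)."""
    n = name.strip().lower()
    vowels = set("aeiou")
    return {
        "length": len(n),
        "ends_in_vowel": int(n[-1] in vowels),
        "ends_in_a": int(n.endswith("a")),
        "vowel_ratio": sum(c in vowels for c in n) / max(len(n), 1),
        "last_two": n[-2:],
    }

feats = name_features("Maria")
```

Features like these would then be fed, alongside age, into a classifier such as a Random Forest.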
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016 (MLconf)
Building a Machine Learning Platform at Quora: Each month, over 100 million people use Quora to share and grow their knowledge. Machine learning has played a critical role in enabling us to grow to this scale, with applications ranging from understanding content quality to identifying users’ interests and expertise. By investing in a reusable, extensible machine learning platform, our small team of ML engineers has been able to productionize dozens of different models and algorithms that power many features across Quora.
In this talk, I’ll discuss the core ideas behind our ML platform, as well as some of the specific systems, tools, and abstractions that have enabled us to scale our approach to machine learning.
Tech talk by Serena Signorelli (https://www.linkedin.com/in/serenasignorelli/) at the event "Tensorflow and Sparklyr: Scaling Deep Learning and R to the Big Data ecosystem", May 15, 2017 at ICTeam Grassobbio (BG). The event was part of the Data Science Milan Meetup (https://www.meetup.com/it-IT/Data-Science-Milan/).
Misha Bilenko, Principal Researcher, Microsoft at MLconf SEA - 5/01/15 (MLconf)
Many Shades of Scale: Big Learning Beyond Big Data: In the machine learning research community, much of the attention devoted to ‘big data’ in recent years has been manifested as development of new algorithms and systems for distributed training on many examples. This focus has led to significant advances in the field, from basic but operational implementations on popular platforms to highly sophisticated prototypes in the literature. In the meantime, other aspects of scaling up learning have received relatively little attention, although they are often more pressing in practice. The talk will survey these less-studied facets of big learning: scaling to an extremely large number of features, to many components in predictive pipelines, and to multiple data scientists collaborating on shared experiments.
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018 (Sri Ambati)
This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg
The good news is that building fair, accountable, and transparent machine learning systems is possible. The bad news is that it’s harder than many blogs and software package docs would have you believe. The truth is that nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!
This talk aims to make your interpretable machine learning project a success by describing fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining viable techniques for debugging, explaining, and testing machine learning models.
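As one concrete instance of an approximate explanation technique, here is a minimal permutation importance sketch in plain Python: shuffle one feature column and measure how much the model's score drops. The toy model and data are invented for illustration; this is not necessarily one of the exact techniques covered in the talk:

```python
import random

def permutation_importance(predict, X, y, col, metric, n_repeats=5, seed=0):
    """Average drop in `metric` when feature column `col` is shuffled."""
    rng = random.Random(seed)
    base = metric(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        shuffled = [list(row) for row in X]
        values = [row[col] for row in shuffled]
        rng.shuffle(values)
        for row, v in zip(shuffled, values):
            row[col] = v
        drops.append(base - metric(y, [predict(row) for row in shuffled]))
    return sum(drops) / n_repeats

# Toy model that only looks at feature 0, so feature 1 should get zero importance.
predict = lambda row: int(row[0] > 0.5)
accuracy = lambda y, p: sum(a == b for a, b in zip(y, p)) / len(y)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
y = [1, 0, 1, 0]
imp0 = permutation_importance(predict, X, y, 0, accuracy)
imp1 = permutation_importance(predict, X, y, 1, accuracy)
```

Note the approximation: the importance estimate depends on the shuffles drawn, which is exactly the kind of caveat the talk warns about.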
Mateusz is a software developer who loves all things distributed and machine learning, and hates buzzwords. His favourite hobby is data juggling.
He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at L’ECE Paris in France and worked on distributed flight booking systems. After graduation he moved to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, where he is still currently based.
Logistic Regression is one of the most popular Machine Learning methods for solving classification problems. With Logistic Regressions in your Dashboard and in the BigML API, you will be able to easily create and download models to your environment for fast local predictions.
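For readers new to the method, the mechanics behind a logistic regression can be sketched in a few lines. Below is a minimal, illustrative batch gradient descent implementation in plain Python; the toy data and hyperparameters are invented for the sketch, and this is not BigML's implementation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, xi):
    """Classify by thresholding the predicted probability at 0.5."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5)

# Linearly separable toy data: class is 1 when x0 > x1.
X = [[2.0, 1.0], [1.0, 2.0], [3.0, 0.5], [0.5, 3.0]]
y = [1, 0, 1, 0]
w, b = train_logistic(X, y)
```

The decision boundary is linear in the inputs, which is why logistic regression is fast to train and to apply for local predictions.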
Our Spring 2017 release brings Time Series, a supervised learning method for analyzing time based data when historical patterns can explain future behavior. This new resource is available from the BigML Dashboard, API, as well as from WhizzML for its automation. Time Series is commonly used for predicting stock prices, sales forecasting, website traffic, production and inventory analysis as well as weather forecasting, among many other use cases.
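As rough intuition for how such models exploit historical patterns, here is a minimal sketch of simple exponential smoothing, one member of the exponential smoothing family commonly used in time series forecasting. The series and smoothing factor are invented for illustration; this is not BigML's implementation:

```python
def exp_smooth_forecast(series, alpha=0.5):
    """Simple exponential smoothing: level = alpha*y + (1-alpha)*level.
    The one-step-ahead forecast is the final smoothed level."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Recent observations weigh more than older ones, controlled by alpha.
forecast = exp_smooth_forecast([10.0, 12.0, 11.0, 13.0], alpha=0.5)
```

Richer variants add trend and seasonal components, which is what makes the family useful for sales, traffic, and inventory forecasting.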
VSSML17 L7. REST API, Bindings, and Basic Workflows (BigML, Inc)
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 7: REST API, Bindings, and Basic Workflows. By jao - Jose A. Ortega - (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
There is a wealth of valuable information hidden in data, but we tend to use only what we already know to look for. Data mining can help you identify that hidden value, but the absence of tools and knowledge is often the show stopper.
IBM SPSS Catalyst helps you unveil the hidden gems in three steps:
1. Upload your data
2. Run the program
3. All relationships and dependencies in the data are shown in easy-to-read graphical interfaces
Check it out and contact us for more information!
2017 AI and Cloud Brand Leader Mini-Report (IT Brand Pulse)
This IT Brand Pulse mini-report includes only market leader data from the independent, non-sponsored survey covering six categories of brand leadership—Market, Price, Performance, Reliability, Service & Support and Innovation—for nine AI and Cloud products.
Complete survey data for each product category is available. Please contact us at info@itbrandpulse.com for information and pricing.
How to implement a truly modular ecommerce platform on the example of Spryker... (Fabian Wesner)
The focus of my presentation is patterns, principles and, especially, practical solutions for implementing an ecommerce platform that consists of hundreds of decoupled and coherent modules. This talk is suitable for senior developers who design large-scale systems and are interested in architectural insights and inspiration.
Keywords: SOLID, Packaging Principles, Design Patterns, Macro- and Microservices, Semantic Versioning, Composer
IBM Connections Customizer – A Whole New World of Possibilities (LetsConnect)
Adapting the look, feel and behaviour of IBM Connections has never been easy, but the advent of a new tool called IBM Connections Customizer changes all that. This session demonstrates how to modify the Connections UI by taking you through use cases ranging from simple tweaks to all-new sophisticated visual layouts. Learn to add new capabilities to Connections by adding actions that call the latest APIs and integrate secure 3rd-party services like Watson. The Customizer development and governance models will also be fully explained – come learn how to take control of your IBM Connections world!
Presentation that was developed for IoT DevCon in April 2017, Santa Clara California USA.
In this deck, I go through the value of data and its derivatives (analytics) from an economic point of view, and then connect that to how we can package this value for monetization in the context of IoT. The key technology enabler is the Digital Twin, which is described along with the platforms that support this paradigm. It ends with recommendations on how to get ready and how to start using the GE Digital Predix platform.
The Scout24 Data Platform (A Technical Deep Dive) (RaffaelDzikowski)
The Scout24 Data Platform powers all reporting, ad hoc analytics and machine learning products at AutoScout24 and ImmobilienScout24. In this talk, we will take a technical deep dive into our modern, cloud-based big data platform. We will discuss our evolution of approaches to ingestion, ETL, access control, reporting, and machine learning with a focus on in-the-trenches learnings gained from our many failures and successes as we migrated from a traditional Oracle Data Warehouse to an AWS-based data lake.
Data engineering at the interface of art and analytics: the why, what, and ho... (Data Con LA)
Abstract:- Netflix has a growing presence in Hollywood, with technical teams working on everything from high-speed video editing pipelines to machine learning methods for categorizing films. Data is foundational across these efforts and in this talk Josh will take a tour through why we invest so much in data about content, what data engineering challenges we tackle, and the style in which we do it.
Bhadale group of companies projects portfolio - a list of publicly shareable projects from the past 10 years. Technologies used are AI/ML, Scala, Spark, Akka, Play, IoT, Hadoop, React, JavaScript, and several other related ones.
This IT Brand Pulse mini-report includes only market leader data from the independent, non-sponsored survey conducted in March 2017 covering six categories of brand leadership–Market, Price, Performance, Reliability, Service & Support and Innovation–for twelve Networking & Scale-out Storage products.
Complete survey data for each product category is available. Please contact us at info@itbrandpulse.com for information and pricing.
A video conversation with performance and capacity management veteran, Boris Zibitsker, about how I saved a multi-million dollar computing platform, using a 1-line performance model (at 21:50 minutes). "Best practices" caused the problem.
1) Learn about Myplanet's Headless CMS solution using Gatsby Preview and Contentful’s UI Extensions (https://www.contentful.com/resources/serverless/)
2) their Serverless project with IBM - using Apache OpenWhisk (https://www.ibm.com/cloud/functions)
3) how Myplanet got involved with AWS DeepRacer - a fun way to get started with Reinforcement Learning (RL), and their racing experience at re:Invent DeepRacer League (https://reinvent.awsevents.com/learn/deepracer/)
4) their Machine Learning (ML) research related to finding DeepRacer’s ideal line (https://medium.com/myplanet-musings/the-best-path-a-deepracer-can-learn-2a468a3f6d64).
BONUS: Two TED Talks referenced in the intro
5) When ideas have sex | Matt Ridley | Jul 14, 2010 https://www.ted.com/talks/matt_ridley_when_ideas_have_sex
6) Why The Best Leaders Make Love The Top Priority | Matt Tenney | Dec 5, 2019 https://www.youtube.com/watch?v=qCVoohdyI6I
VIDEO: https://youtu.be/ZH1xxmBNx5k
IMS on the mainframe hosts many enterprise-critical assets, transactional and batch applications as well as data. Analytics solutions apply to both!
Contact me for more details.
Optimizing your SparkML pipelines using the latest features in Spark 2.3 (DataWorks Summit)
This talk will highlight the recent additions of vectorized UDFs and parallel cross-validation in Apache Spark 2.3. Vectorized UDFs not only enhance performance, they also open up more possibilities by using Pandas for the input and output of the UDF. Parallel cross-validation speeds up tuning ML models by exploiting your Spark cluster resources to the max. Bryan will discuss the details of Apache Arrow in Spark, how it is relevant to the rest of the big data ecosystem, and what is in store for the future. We will share performance results from using parallelism in cross-validation and some ongoing work with optimizing ML pipelines.
This talk will also touch upon how we are leveraging this work to enhance the end-to-end Enterprise AI lifecycle in open source for developers and data scientists. Finally, we will highlight some of the other relevant projects at the Center for Open Source Data and AI Technologies and how you can contribute to them.
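The parallel cross-validation idea is simple to sketch outside Spark: independent train/test folds can be evaluated concurrently instead of one by one. Below is an illustrative pure-Python analogue using a thread pool (in Spark 2.3 itself this is exposed through CrossValidator's parallelism parameter; the toy "model" here is invented):

```python
from concurrent.futures import ThreadPoolExecutor

def cross_val_scores(fit_score, folds, parallelism=2):
    """Score each (train, test) fold concurrently; order is preserved."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(lambda fold: fit_score(*fold), folds))

# Toy "model": its score is just the mean of the test fold.
fit_score = lambda train, test: sum(test) / len(test)
folds = [([1, 2], [3, 4]), ([3, 4], [1, 2])]
scores = cross_val_scores(fit_score, folds)
```

Because folds share no state, the speedup scales with available executors, which is exactly what Spark exploits on a cluster.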
Speakers
Vijay Bommireddipalli, Program Director: Center for Open Source Data and AI Technologies, IBM
Bryan Cutler, Software Engineer - Center for Open Source Data and AI Technologies, IBM
Digital Transformation and Process Optimization in Manufacturing (BigML, Inc)
Keyanoush Razavidinani, Digital Services Consultant at A1 Digital, a BigML Partner, highlights why it is important to identify and reduce human bottlenecks so that you can optimize processes and focus on important activities. Additionally, Guillem Vidal, Machine Learning Engineer at BigML, completes the session by showcasing how Machine Learning is put to use in the manufacturing industry with a use case on detecting factory failures.
The Road to Production: Automating your Anomaly Detectors - by jao (Jose A. Ortega), Co-Founder and Chief Technology Officer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - ML for AML Compliance (BigML, Inc)
Machine Learning for Anti Money Laundering Compliance, by Kevin Nagel, Consultant and Data Scientist at INFORM.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Multi Perspective Anomalies (BigML, Inc)
Multi Perspective Anomalies, by Jan W Veldsink, Master in the art of AI at Nyenrode, Rabobank, and Grio.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - My First Anomaly Detector (BigML, Inc)
My First Anomaly Detector: Practical Workshop, by Mercè Martín, VP of Bindings and Applications at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - History and Developments in ML (BigML, Inc)
History and Present Developments in Machine Learning, by Tom Dietterich, Emeritus Professor of computer science at Oregon State University and Chief Scientist at BigML.
*Machine Learning School in The Netherlands 2022.
Introduction to End-to-End Machine Learning: Classification and Regression - Mercè Martín, VP of Bindings and Applications at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - A Data-Driven Company (BigML, Inc)
A Data-Driven Company: 21 Lessons for Large Organizations to Create Value from AI, by Richard Benjamins, Chief AI and Data Strategist at Telefónica.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - ML in the Legal Sector (BigML, Inc)
How Machine Learning Transforms and Automates Legal Services, by Arnoud Engelfriet, Co-Founder at Lynn Legal.
*Machine Learning School in The Netherlands 2022.
Machine Learning for Public Safety: Reducing Violence and Discrimination in Stadiums.
Speakers: Ramon van Ingen, Co-Founder at Siip, Entrepreneur, Researcher, and Pablo González, Machine Learning Engineer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants (BigML, Inc)
Process Optimization in Manufacturing Plants, by Keyanoush Razavidinani, Digital Business Consultant at A1 Digital.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Anomaly Detection at Scale (BigML, Inc)
Lessons Learned Applying Anomaly Detection at Scale, by Álvaro Clemente, Machine Learning Engineer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Citizen Development in AI (BigML, Inc)
Citizen Development in AI, by Jan W Veldsink, Master in the art of AI at Nyenrode, Rabobank, and Grio.
*Machine Learning School in The Netherlands 2022.
This new feature is a continuation of and improvement on our previous Image Processing release. Object Detection lets you go a step further with your image data, allowing you to locate objects and annotate regions in your images. Once your image regions are defined, you can train and evaluate Object Detection models, make predictions with them, and automate end-to-end Machine Learning workflows on a single platform. To make that possible, BigML enables Object Detection by introducing the regions optype.
As with any other BigML feature, Object Detection is available from the BigML Dashboard, API, and WhizzML for automation. Object Detection is extremely helpful for tackling a wide range of computer vision use cases such as medical image analysis, quality control in manufacturing, license plate recognition in transportation, and people detection in security surveillance, among many others.
This new release brings Image Processing to the BigML platform, a feature that enhances our offering to solve image data-driven business problems with remarkable ease of use. Because BigML treats images as any other data type, this unique implementation allows you to easily use image data alongside text, categorical, numeric, date-time, and items data types as input to create any Machine Learning model available in our platform, both supervised and unsupervised.
Now, it is easier than ever to solve a wide variety of computer vision and image classification use cases in a single platform: label your image data, train and evaluate your models, make predictions, and automate your end-to-end Machine Learning workflows. As with any other BigML feature, Image Processing is available from the BigML Dashboard, API, and WhizzML, and it can be applied to solve use cases such as medical image analysis, visual product search, security surveillance, and vehicle damage detection, among others.
Machine Learning in Retail: Know Your Customers' Customer. See Your Future (BigML, Inc)
This session presents a quite common situation for those working in food and beverage retail (FnB) and highlights interesting insights for reducing waste.
Speaker: Stephen Kinns, CEO and Co-Founder at catsAi.
*ML in Retail 2021: Webinar.
Machine Learning in Retail: ML in the Retail Sector (BigML, Inc)
This is an introductory session about the role that Machine Learning is playing in the retail sector and how it is being deployed across the different areas of this industry.
Speaker: Atakan Cetinsoy, VP of Predictive Applications at BigML.
*ML in Retail 2021: Webinar.
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot (BigML, Inc)
This presentation analyzes the role that Machine Learning plays in legal automation with a real-world Machine Learning application.
Speaker: Arnoud Engelfriet, Co-Founder at Lynn Legal.
*ML in GRC 2021: Virtual Conference.
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac... (BigML, Inc)
This is a real-life Machine Learning use case about integrated risk.
Speakers: Thomas Rengersen, Product Owner of the Governance Risk and Compliance Tool for Rabobank, and Thomas Alderse Baas, Co-Founder and Director of The Bowmen Group.
*ML in GRC 2021: Virtual Conference.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
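For reference, the monolithic baseline mentioned in the abstract is ordinary power-iteration PageRank, where all vertices are processed in every iteration. A minimal Python sketch, assuming a dead-end-free graph as the abstract's precondition requires (the graph and constants below are invented for illustration):

```python
def pagerank(graph, d=0.85, iters=50):
    """Monolithic power-iteration PageRank.
    `graph` maps each vertex to its out-neighbours; assumes no dead ends,
    so the total rank mass stays at 1 across iterations."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        nxt = {v: (1 - d) / n for v in graph}  # teleport contribution
        for v, outs in graph.items():
            share = d * rank[v] / len(outs)    # split rank over out-edges
            for u in outs:
                nxt[u] += share
        rank = nxt
    return rank

# Tiny strongly connected graph (a 3-cycle, so no dead ends).
g = {"a": ["b"], "b": ["c"], "c": ["a"]}
r = pagerank(g)
```

Levelwise PageRank instead runs this kind of iteration per strongly connected component, in topological order of the component DAG.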
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
2. BigML, Inc Summer Release Webinar - October 2017
Summer 2017 Release
Speaker: CHARLES PARKER, PH.D. - VP of Machine Learning Algorithms
Moderator: ATAKAN CETINSOY - VP of Predictive Applications
Questions: Please enter questions into the chat box; we will answer some via chat and others at the end of the session
Resources: https://bigml.com/releases/summer-2017
Contact support@bigml.com
Twitter @bigmlcom
3. Power To The People!
• Why another supervised learning algorithm?
• Deep neural networks have been shown to be state of the art in several niche applications:
• Vision
• Speech recognition
• NLP
• While powerful, these networks have historically been difficult for novices to train
4. Goals of BigML Deepnets
• What BigML Deepnets are not (yet):
• Convolutional networks
• Recurrent networks (e.g., LSTM networks)
• These solve a particular type of sub-problem, and are carefully engineered by experts to do so
• Can we bring some of the power of deep neural networks to your problem, even if you have no deep learning expertise?
• Let's try to separate deep neural network myths from realities
5. Myth #1
Deep neural networks are the next step in evolution, destined to perfect humanity or destroy it utterly.
6. Some Weaknesses
• Trees
• Pro: Massive representational power that expands as the data gets larger; efficient search through this space
• Con: Difficult to represent smooth functions and functions of many variables
• Ensembles mitigate some of these difficulties
• Logistic Regression
• Pro: Some smooth, multivariate functions are not a problem; fast optimization
• Con: Parametric; if the decision boundary is nonlinear, tough luck
• Can these be mitigated?
7. Logistic Level Up
[Diagram: logistic regression drawn as a network mapping inputs to outputs]
8. Logistic Level Up
[Diagram: each output class computed as logistic(w, b) over weighted inputs wi]
9. Logistic Level Up
[Diagram: a hidden layer inserted between inputs and outputs]
10. Logistic Level Up
[Diagram: each hidden node computes its own logistic(w, b)]
11. Logistic Level Up
[Diagram: generalizing to n hidden nodes per layer]
12. Logistic Level Up
[Diagram: generalizing to n hidden layers]
13. Logistic Level Up
[Diagram: the resulting deep network of stacked logistic units]
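The "level up" idea in these slides, stacking logistic units into hidden layers, can be sketched in a few lines. The weights below are hand-set for illustration (not trained, and not BigML's implementation) to show that one hidden layer of logistic units lets the network compute XOR, a boundary no single logistic unit can draw:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def unit(inputs, weights, bias):
    """One logistic unit: logistic(w . x + b)."""
    return logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_net(x1, x2):
    """Two hand-set hidden logistic units feeding one output unit:
    h1 ~ OR(x1, x2), h2 ~ AND(x1, x2), output ~ h1 AND NOT h2 = XOR."""
    h1 = unit([x1, x2], [10, 10], -5)     # fires if either input is 1
    h2 = unit([x1, x2], [10, 10], -15)    # fires only if both inputs are 1
    return unit([h1, h2], [10, -20], -5)  # fires if h1 but not h2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xor_net(x1, x2)))
```

Deleting the hidden layer collapses this back to plain logistic regression, which cannot separate XOR's classes with any choice of weights.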
14. Myth #2
Deep neural networks are great for the established marquee applications, but less interesting for general use.
15. Parameter Paralysis

Parameter Name                   Possible Values
Descent Algorithm                Adam, RMSProp, Adagrad, Momentum, FTRL
Number of hidden layers          0 - 32
Activation Function (per layer)  relu, tanh, sigmoid, softplus, etc.
Number of nodes (per layer)      1 - 8192
Learning Rate                    0 - 1
Dropout Rate                     0 - 1
Batch size                       1 - 1024
Batch Normalization              True, False
Learn Residuals                  True, False
Missing Numerics                 True, False
Objective weights                Weight per class

. . . and that's ignoring the parameters that are specific to the descent algorithm.
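With BigML's Python bindings, these knobs are passed as creation arguments on the deepnet resource. The sketch below is an illustration, not a verified call: the field names are assumptions that mirror the parameter table above, so check them against the current BigML API documentation before use (the `api.create_deepnet` call is commented out because it needs account credentials):

```python
# Hypothetical deepnet configuration; field names assumed from the
# parameter table above. Verify against BigML's deepnet API docs.
deepnet_args = {
    "hidden_layers": [
        {"activation_function": "relu", "number_of_nodes": 64},
        {"activation_function": "relu", "number_of_nodes": 64},
    ],
    "learning_rate": 0.01,
    "dropout_rate": 0.2,
    "batch_normalization": False,
    "learn_residuals": False,
    "missing_numerics": True,
}

# from bigml.api import BigML   # needs BIGML_USERNAME / BIGML_API_KEY set
# api = BigML()
# deepnet = api.create_deepnet("dataset/<your-dataset-id>", deepnet_args)
print(len(deepnet_args["hidden_layers"]))  # two hidden layers configured
```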
16. What Can We Do?
• Clearly there are too many parameters to fuss with
• Setting them takes significant expert knowledge
• Solution: Metalearning (a good initial guess)
• Solution: Network search (try a bunch)
17. Metalearning
We trained 296,748 deep neural networks so you don't have to!
18. Bayesian Parameter Optimization
[Diagram: six candidate structures queued for model-and-evaluate]
19. Bayesian Parameter Optimization
[Diagram: Structure 1 modeled and evaluated, score 0.75]
20. Bayesian Parameter Optimization
[Diagram: Structure 2 modeled and evaluated, score 0.48]
21. Bayesian Parameter Optimization
[Diagram: Structure 3 modeled and evaluated, score 0.91]
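The loop these slides illustrate, evaluate a structure, update a surrogate model of structure quality, pick the next structure to try, can be sketched as follows. This is a toy stand-in, not BigML's implementation: the surrogate here is a crude distance-weighted average with an exploration bonus, whereas real systems fit a proper probabilistic model (e.g., a Gaussian process) over the evaluated structures:

```python
import random

def distance(a, b):
    # toy distance between structures (hidden_layers, nodes_per_layer)
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) / 100.0

def surrogate(i, candidates, scores):
    # predicted score: distance-weighted mean of observed scores,
    # plus a bonus for being far from anything already evaluated
    num = den = 0.0
    nearest = float("inf")
    for j, s in scores.items():
        d = distance(candidates[i], candidates[j])
        nearest = min(nearest, d)
        w = 1.0 / (d + 1e-6)
        num += w * s
        den += w
    return num / den + 0.25 * nearest

def bayesian_search(candidates, evaluate, budget, seed=0):
    rng = random.Random(seed)
    first = rng.randrange(len(candidates))
    scores = {first: evaluate(candidates[first])}
    while len(scores) < budget:
        remaining = [i for i in range(len(candidates)) if i not in scores]
        pick = max(remaining, key=lambda i: surrogate(i, candidates, scores))
        scores[pick] = evaluate(candidates[pick])
    best = max(scores, key=scores.get)
    return candidates[best], scores[best]

# toy objective: quality peaks at 2 hidden layers of 128 nodes
structures = [(l, n) for l in range(1, 5) for n in (32, 64, 128, 256)]
def evaluate(s):
    return 1.0 - 0.2 * abs(s[0] - 2) - abs(s[1] - 128) / 512.0

best, score = bayesian_search(structures, evaluate, budget=8)
print(best, round(score, 3))
```

The point of the surrogate is to spend the evaluation budget where it matters: promising regions get exploited, unexplored regions still get occasional probes.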
23. Benchmarking
• The ML world is filled with crummy benchmarks:
• Not enough datasets
• No cross-validation
• Only one metric
• Solution: Roll our own
• 50+ datasets, 5 replications of 10-fold CV
• 10 different metrics
• 30+ competing algorithms (R, scikit-learn, weka, xgboost)
http://www.clparker.org/ml_benchmark/
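The "5 replications of 10-fold CV" protocol is simple to reproduce. A minimal standard-library sketch is below (scikit-learn ships this as `RepeatedKFold`); each replication reshuffles the data and splits it into k disjoint test folds:

```python
import random

def repeated_kfold(n_instances, k=10, replications=5, seed=0):
    """Yield (replication, fold, test_indices): each replication is an
    independent shuffle split into k disjoint test folds."""
    rng = random.Random(seed)
    for rep in range(replications):
        order = list(range(n_instances))
        rng.shuffle(order)
        for fold in range(k):
            yield rep, fold, order[fold::k]  # every k-th index -> one fold

splits = list(repeated_kfold(100))
print(len(splits))  # 5 replications x 10 folds = 50 test sets
```

Averaging a metric over all 50 test sets, rather than a single train/test split, is what makes the benchmark's per-dataset comparisons statistically meaningful.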
24. Myth #3
Deep neural networks have such spectacular performance that all other supervised learning techniques are now irrelevant.
25. Caveat Emptor
• Things that make deep learning less useful:
• Small data (where that could still be thousands of instances)
• Problems where you could benefit by iterating quickly (better features always beat better models)
• Problems that are easy, or for which top-of-the-line performance isn't absolutely critical
• Remember: deep learning is just another sort of supervised learning algorithm
"…deep learning has existed in the neural network community for over 20 years. Recent advances are driven by some relatively minor improvements in algorithms and models and by the availability of large data sets and much more powerful collections of computers." — Stuart Russell
26. Learn about Deepnets
https://bigml.com/releases/summer-2017