Java and Deep Learning
Java Meetup SF Pivotal Labs
January 10, 2018
Oswald Campesato
Session Overview
partial intro/overview of AI/ML/DL
a simple neural network
linear regression
cost/activation functions
gradient descent/back propagation
CNNs and RNNs
Java code (CNN and TensorFlow)
The Data/AI Landscape
Gartner 2017: Deep Learning (YES!)
Neural Network with 3 Hidden Layers
The Official Start of AI (1956)
AI/ML/DL: How They Differ
Traditional AI (20th century):
based on collections of rules
Led to expert systems in the 1980s
The era of LISP and Prolog
AI/ML/DL: How They Differ
Machine Learning:
Started in the 1950s (approximate)
Alan Turing and “learning machines”
Data-driven (not rule-based)
Many types of algorithms
Involves optimization
AI/ML/DL: How They Differ
Deep Learning:
Started in the 1950s (approximate)
The “perceptron” (basis of NNs)
Data-driven (not rule-based)
large (even massive) data sets
Involves neural networks (CNNs: ~1970s)
Lots of heuristics
Heavily based on empirical results
The Rise of Deep Learning
Massive and inexpensive computing power
Huge volumes of data/Powerful algorithms
The “big bang” in 2009:
"deep-learning neural networks and NVidia GPUs"
Google Brain used NVidia GPUs (2009)
AI/ML/DL: Commonality
All of them involve a model
A model represents a system
Goal: a good predictive model
The model is based on:
Many rules (for AI)
data and algorithms (for ML)
large sets of data (for DL)
Clustering Example #1
Given some red dots and blue dots
Red dots are in the upper half plane
Blue dots in the lower half plane
How to detect if a point is red or blue?
Clustering Example #1
Clustering Example #1
Clustering Example #2
Given some red dots and blue dots
Red dots are inside a unit square
Blue dots are outside the unit square
How to detect if a point is red or blue?
Clustering Example #2
Two input nodes X and Y
One hidden layer with 4 nodes (one per side of the square)
X & Y weights are the (x,y) values of the inward-pointing perpendicular vector of each side
The threshold values are the negative of the y-intercept (or the x-intercept)
The outbound weights are all equal to 1
The threshold for the output node is 4
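The network described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the unit square is taken to be [0,1] x [0,1], a node fires when its weighted input is at or above its threshold, boundary points count as red, and the names W, thresholds, step, and is_red are made up for this sketch.

```python
import numpy as np

# Inward-pointing normals for the four sides of the unit square [0,1] x [0,1].
W = np.array([[ 1.0,  0.0],   # side x = 0
              [-1.0,  0.0],   # side x = 1
              [ 0.0,  1.0],   # side y = 0
              [ 0.0, -1.0]])  # side y = 1

# Negative of each side's intercept (0 for x = 0 and y = 0, -1 for x = 1 and y = 1).
thresholds = np.array([0.0, -1.0, 0.0, -1.0])

def step(z):
    """Step activation: 1 where z >= 0, else 0."""
    return (z >= 0).astype(int)

def is_red(point):
    """Return 1 if the point lies inside the unit square (red), else 0 (blue)."""
    hidden = step(W @ np.asarray(point, dtype=float) - thresholds)
    # Outbound weights are all 1, so the output node just sums the hidden
    # nodes and fires only when all four agree (threshold 4).
    return int(hidden.sum() >= 4)

print(is_red((0.5, 0.5)))  # 1 (inside: red)
print(is_red((1.5, 0.5)))  # 0 (outside: blue)
```

Nothing is learned here: the weights and thresholds are read directly off the geometry of the square.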
Clustering Example #2
Clustering Example #2
Clustering Exercises #1
Describe an NN for a triangle
Describe an NN for a pentagon
Describe an NN for an n-gon (convex)
Describe an NN for an n-gon (non-convex)
Clustering Exercises #2
Create an NN for an OR gate
Create an NN for a NOR gate
Create an NN for an AND gate
Create an NN for a NAND gate
Create an NN for an XOR gate
=> requires a hidden layer (one hidden layer with two nodes suffices)
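XOR is not linearly separable, so a single step neuron cannot compute it, but one hidden layer with two step units is enough. The sketch below is one possible hand-wired solution (hidden units acting as OR and NAND, output acting as AND); the weights and the names step and xor_gate are illustrative choices, not taken from the slides.

```python
import numpy as np

def step(z):
    """Step activation: 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(int)

def xor_gate(a, b):
    x = np.array([a, b])
    # Hidden layer: first unit computes OR, second unit computes NAND.
    W_hidden = np.array([[ 1,  1],
                         [-1, -1]])
    b_hidden = np.array([-0.5, 1.5])
    h = step(W_hidden @ x + b_hidden)
    # Output unit: AND of the two hidden units.
    return int(step(np.array([1, 1]) @ h - 1.5))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_gate(a, b))
```

The other gates in the exercise need no hidden layer at all, since each is a single linear threshold.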
Clustering Exercises #3
Convert example #2 to a 3D cube
Clustering Example #2
A few points to keep in mind:
A “step” activation function (0 or 1)
No back propagation
No cost function
=> no learning involved
A 2D Linear Regression Model
Perform the following steps:
1) Start with a simple model (2 variables)
2) Generalize that model (n variables)
3) See how it might apply to a NN
Linear Regression Details
One of the simplest models in ML
Fits a line (y = m*x + b) to data in 2D
Finds best line by minimizing MSE:
m = cov(x, y) / var(x) (closed-form solution)
b = mean(y) - m * mean(x) (also closed form)
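For a 2D least-squares fit, the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal sketch on made-up data (the arrays x and y are hypothetical and lie exactly on y = 2x + 1):

```python
import numpy as np

# Toy data on the line y = 2x + 1 (hypothetical values).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Closed-form least-squares solution for y = m*x + b.
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()

# Mean squared error of the fitted line.
mse = np.mean((y - (m * x + b)) ** 2)

print(m, b, mse)  # 2.0 1.0 0.0
```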
Linear Regression in 2D: graph
Sample Cost Function #1 (MSE)
Linear Regression: example #1
One feature (independent variable):
X = number of square feet
Predicted value (dependent variable):
Y = cost of a house
A very “coarse grained” model
We can devise a much better model
Linear Regression: example #2
Multiple features:
X1 = # of square feet
X2 = # of bedrooms
X3 = # of bathrooms (dependency?)
X4 = age of house
X5 = cost of nearby houses
X6 = corner lot (or not): Boolean
a much better model (6 features)
Linear Multivariate Analysis
General form of multivariate equation:
Y = w1*x1 + w2*x2 + . . . + wn*xn + b
w1, w2, . . . , wn are numeric values
x1, x2, . . . , xn are variables (features)
Properties of variables:
Can be independent (Naïve Bayes)
weak/strong dependencies can exist
Neural Network with 3 Hidden Layers
Neural Networks: equations
Node “values” in first hidden layer:
N1 = w11*x1+w21*x2+…+wn1*xn
N2 = w12*x1+w22*x2+…+wn2*xn
N3 = w13*x1+w23*x2+…+wn3*xn
. . .
Nn = w1n*x1+w2n*x2+…+wnn*xn
Similar equations for other pairs of layers
Neural Networks: Matrices
From inputs to first hidden layer:
Y1 = W1*X + B1 (X/Y1/B1: vectors; W1: matrix)
From first to second hidden layers:
Y2 = W2*Y1 + B2 (Y1/Y2/B2: vectors; W2: matrix)
From second to third hidden layers:
Y3 = W3*Y2 + B3 (Y2/Y3/B3: vectors; W3: matrix)
Apply an “activation function” to the Y values of each layer before passing them to the next layer
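The layer-by-layer matrix form can be sketched as a short forward pass, where each layer's output feeds the next. The layer sizes, the ReLU activation, and the small random initial weights below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

# Hypothetical layer sizes: 4 inputs, then hidden layers of 5, 5, and 3 nodes.
sizes = [4, 5, 5, 3]
weights = [rng.normal(0, 0.1, size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

def forward(x):
    """Compute Y_k = activation(W_k * Y_(k-1) + B_k) for each layer in turn."""
    y = x
    for W, B in zip(weights, biases):
        y = relu(W @ y + B)
    return y

x = np.array([1.0, 0.5, -0.5, 2.0])
y3 = forward(x)
print(y3.shape)  # (3,)
```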
Neural Networks (general)
Multiple hidden layers:
Layer composition is your decision
Activation functions: sigmoid, tanh, ReLU
Back propagation (1980s)
=> Initial weights: small random numbers
Euler’s Function
The sigmoid Activation Function
The tanh Activation Function
The ReLU Activation Function
The softmax Activation Function
Activation Functions in Python
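import numpy as np
# Sample weight matrix and input vector (illustrative values):
W = np.array([[0.5, -1.0], [1.5, 2.0]])
x = np.array([1.0, 2.0])
# Python sigmoid example:
z = 1/(1 + np.exp(-np.dot(W, x)))
# Python tanh example:
z = np.tanh(np.dot(W, x))
# Python ReLU example:
z = np.maximum(0, np.dot(W, x))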
What’s the “Best” Activation Function?
Initially: sigmoid was popular
Then: tanh became popular
Now: ReLU is preferred (better results)
Softmax: for the output layer of multiclass classifiers (usually a fully connected layer)
NB: sigmoid and tanh are used in LSTMs
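Softmax turns a vector of raw scores into probabilities that sum to 1. A minimal NumPy sketch (the input scores are made up, and subtracting the maximum is a common numerical-stability step rather than part of the definition):

```python
import numpy as np

def softmax(z):
    """Map a 1D vector of scores to probabilities that sum to 1."""
    shifted = z - np.max(z)      # avoids overflow in exp for large scores
    exps = np.exp(shifted)
    return exps / exps.sum()

scores = np.array([2.0, 1.0, 0.1])   # hypothetical output-layer scores
probs = softmax(scores)
print(probs, probs.sum())
```

The largest score always receives the largest probability, which is why the predicted class is usually taken as the argmax of the softmax output.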
Even More Activation Functions!
Sample Cost Function #1 (MSE)
Sample Cost Function #2
Sample Cost Function #3
How to Select a Cost Function
1) Depends on the learning type:
=> supervised/unsupervised/RL
2) Depends on the activation function
3) Other factors
cross-entropy cost function for supervised
learning on multiclass classification
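Two of the cost functions mentioned above can be sketched as follows: MSE for regression and cross-entropy for multiclass classification with one-hot labels. The function names and sample values are illustrative, and the clipping constant eps is an assumption to guard against log(0).

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def cross_entropy(y_onehot, probs, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    probs = np.clip(probs, eps, 1.0)
    return -np.sum(np.asarray(y_onehot) * np.log(probs))

reg_loss = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])        # (0 + 0 + 4) / 3
cls_loss = cross_entropy([0, 1, 0], [0.1, 0.7, 0.2])    # -ln(0.7)
print(reg_loss, cls_loss)
```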
GD versus SGD
SGD (Stochastic Gradient Descent):
+ involves a SUBSET of the dataset
+ aka Minibatch Stochastic Gradient Descent
GD (Gradient Descent):
+ involves the ENTIRE dataset
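The difference between the two can be sketched on a toy linear regression: the same update loop performs GD when the batch is the whole dataset and minibatch SGD when the batch is a random subset. The data, learning rate, batch size, and epoch counts are assumptions picked so that both runs make the same number of updates.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 100)
y = 2.0 * x + 1.0   # noise-free toy data (hypothetical)

def gradients(w, b, xb, yb):
    """Gradient of the MSE cost with respect to w and b on one batch."""
    err = (w * xb + b) - yb
    return 2.0 * np.mean(err * xb), 2.0 * np.mean(err)

def train(batch_size, epochs, lr=0.1):
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        order = rng.permutation(n)              # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            dw, db = gradients(w, b, x[idx], y[idx])
            w -= lr * dw
            b -= lr * db
    return w, b

# GD: every update uses the ENTIRE dataset (one update per epoch).
w_gd, b_gd = train(batch_size=len(x), epochs=2000)
# Minibatch SGD: every update uses a SUBSET of 10 samples (ten updates per epoch).
w_sgd, b_sgd = train(batch_size=10, epochs=200)
print(w_gd, b_gd, w_sgd, b_sgd)
```

Both runs should end near w = 2 and b = 1; on large datasets the minibatch version is usually preferred because each update is much cheaper.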
More details:
Setting up Data & the Model
Normalize the data:
Subtract the ‘mean’ and divide by stddev
[Central Limit Theorem]
Initial weight values for NNs:
Random numbers in N(0,1)
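Both set-up steps fit in a few lines of NumPy. This is a sketch with made-up data: each feature column is standardized to zero mean and unit standard deviation, and an initial weight matrix is drawn from N(0,1); the array shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 samples, 2 features on very different scales.
data = np.array([[1.0, 200.0],
                 [2.0, 300.0],
                 [3.0, 400.0]])

# Normalize: subtract the per-feature mean and divide by the per-feature stddev.
normalized = (data - data.mean(axis=0)) / data.std(axis=0)

# Initial weights: random numbers drawn from N(0,1), here for a 2-input, 4-node layer.
weights = rng.normal(loc=0.0, scale=1.0, size=(2, 4))

print(normalized.mean(axis=0), normalized.std(axis=0))
```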
More details:
What are Hyper Parameters?
higher level concepts about the model such as
complexity, or capacity to learn
Cannot be learned directly from the data in the
standard model training process
must be predefined
Hyper Parameters (examples)
# of hidden layers in a neural network
the learning rate (in many models)
the dropout rate
# of leaves or depth of a tree
# of latent factors in a matrix factorization
# of clusters in a k-means clustering
Hyper Parameter: dropout rate
"dropout" refers to dropping out units (both hidden
and visible) in a neural network
a regularization technique for reducing overfitting in
neural networks
prevents complex co-adaptations on training data
a very efficient way of performing model averaging
with neural networks
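A minimal sketch of dropout applied to one layer's activations, assuming the common "inverted dropout" variant in which surviving units are rescaled during training so that nothing needs to change at inference time. The function name and arguments are illustrative.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero each unit with probability `rate`; rescale survivors by 1/(1 - rate)."""
    if not training or rate == 0.0:
        return activations          # dropout is disabled at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
h = np.ones((2, 5))                 # hypothetical hidden-layer activations
dropped = dropout(h, rate=0.25, rng=rng)
print(dropped)
```

Because a different random mask is drawn on every training step, the network is effectively trained as an ensemble of thinned sub-networks, which is the model-averaging view mentioned above.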
How Many Layers in a DNN?
Algorithm #1 (from Geoffrey Hinton):
1) add layers until you start overfitting your
training set
2) now add dropout or another
regularization method
Algorithm #2 (Yoshua Bengio):
"Add layers until the test error does not improve"
How Many Hidden Nodes in a DNN?
Based on a relationship between:
# of input and # of output nodes
Amount of training data available
Complexity of the cost function
The training algorithm
CNNs versus RNNs
CNNs (Convolutional NNs):
Good for image processing
2000: CNNs processed 10-20% of all checks
=> Approximately 60% of all NNs
RNNs (Recurrent NNs):
Good for NLP and audio
CNNs: convolution-pooling (1)
CNNs: Convolution Calculations
CNNs: Convolution Matrices (examples)
CNNs: Convolution Matrices (examples)
Edge detect:
CNNs: Sample Convolutions/Filters
CNNs: Max Pooling Example
CNNs: convolution and pooling (2)
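The two core CNN operations can be sketched directly in NumPy. The assumptions here: a single-channel image, "valid" padding with stride 1, non-overlapping pooling windows, and a made-up 2x2 filter. As in most deep learning libraries, the "convolution" is implemented as cross-correlation (the filter is not flipped).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(image, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h, w = image.shape[0] // size, image.shape[1] // size
    blocks = image[:h * size, :w * size].reshape(h, size, w, size)
    return blocks.max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])      # hypothetical diagonal-difference filter

feature_map = conv2d_valid(image, kernel)   # 3x3, every entry is -5.0
pooled = max_pool(image)                    # [[ 5.  7.] [13. 15.]]
print(feature_map)
print(pooled)
```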
Sample CNN in Keras (fragment)
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adadelta
# 32x32 RGB images, channels-last (the Keras default data format)
input_shape = (32, 32, 3)
nb_classes = 10
model = Sequential()
# the first layer must be told the input shape
model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
GANs: Generative Adversarial Networks
GANs: Generative Adversarial Networks
Adversarial examples (related to GANs, but a distinct idea):
Make imperceptible changes to images
Can fool many NN classifiers
Can cause an extremely high error rate
Some images create optical illusions
GANs: Generative Adversarial Networks
Create your own GANs:
GANs from MNIST:
GANs: Generative Adversarial Networks
GANs, Graffiti, and Art:
GANs and audio:
Houdini algorithm:
Deep Learning Playground
TF playground home page:
Demo #1:
Converts playground to TypeScript
Java and DL/ML Frameworks
Deeplearning4j: Pure Java framework for DL
Smile: “Statistical Machine Intelligence and Learning Engine”
“outperforms R, Python, Spark, H2O significantly” (the Smile project's own claim)
Weka (“WAY-kuh”):
IBM neuroph:
Deeplearning4j Library
Open source, distributed library for the JVM
Written in Java and Scala (GPU support)
Integrates with Hadoop and Spark
Deeplearning4j Library
Basic set-up steps (command line):
mkdir dl4j-examples
cd dl4j-examples
git clone
Deeplearning4j Library
Set-up steps for IntelliJ
File > Import Project (or New Project from Existing Sources)
Select the directory with the DL4J examples
Select the Maven build tool in the next window
Check the following two boxes:
1) "Search for projects recursively"
2) "Import Maven projects automatically" (then click Next)
Click the "+" sign (bottom of window) to add the JDK/SDK
Click through until you reach "Finish"
Smile Framework
Support for many algorithms
classification, regression, clustering
association rule mining, feature selection
manifold learning, multidimensional scaling
genetic algorithm, missing value imputation
efficient nearest neighbor search
Smile Framework
Natural Language Processing:
Tokenizers, stemming, phrase detection
part-of-speech tagging, keyword extraction
named entity recognition, sentiment analysis
relevance ranking, taxonomy
Smile Framework
Mathematics and Statistics
linear algebra (LU decomposition)
Cholesky decomposition, QR decomposition
eigenvalue decomposition
singular value decomposition
band matrix, and sparse matrix
tests: t-test, F-test, chi-square test
correlation test (Pearson, Spearman, Kendall)
Kolmogorov-Smirnov test
distributions/random number generators
interpolation, sorting, wavelet, plot
Smile Framework
Random Forest (SMILE Scala API):
val data = read.arff("iris.arff", 4)
val (x, y) = data.unzipInt
val rf = randomForest(x, y)
println(s"OOB error = ${rf.error}")
What is TensorFlow?
An open source framework for ML and DL
A “computation” graph
Created by Google (released 11/2015)
Evolved from Google Brain
Linux, Mac OS X, and Windows support
TF home page:
What is TensorFlow?
Support for Python, Java, C++
TPUs available for faster processing
Can be embedded in Python scripts
Installation: pip install tensorflow
TensorFlow cluster:
What is a Tensor?
TF tensors are n-dimensional arrays
TF tensors are very similar to numpy ndarrays
scalar number: a zeroth-order tensor
vector: a first-order tensor
matrix: a second-order tensor
3-dimensional array: a 3rd order tensor
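Since TF tensors behave much like NumPy ndarrays, the notion of tensor order can be illustrated with NumPy alone, where the order is the array's ndim. The values and variable names below are arbitrary.

```python
import numpy as np

scalar = np.array(3.0)                  # zeroth-order tensor
vector = np.array([1.0, 2.0, 3.0])      # first-order tensor
matrix = np.eye(2)                      # second-order tensor
cube = np.zeros((2, 3, 4))              # third-order tensor

orders = [t.ndim for t in (scalar, vector, matrix, cube)]
shapes = [t.shape for t in (scalar, vector, matrix, cube)]
print(orders)  # [0, 1, 2, 3]
print(shapes)  # [(), (3,), (2, 2), (2, 3, 4)]
```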
TensorFlow: constants (immutable)
import tensorflow as tf  # TensorFlow 1.x API
aconst = tf.constant(3.0)
print(aconst)
# output: Tensor("Const:0", shape=(), dtype=float32)
sess = tf.Session()
print(sess.run(aconst))
# output: 3.0
sess.close()
# => there's a better way…
TensorFlow: constants
import tensorflow as tf  # TensorFlow 1.x API
aconst = tf.constant(3.0)
# Automatically close "sess"
with tf.Session() as sess:
    print(sess.run(aconst))
TensorFlow Arithmetic
import tensorflow as tf  # TensorFlow 1.x API
a = tf.add(4, 2)
b = tf.subtract(8, 6)
c = tf.multiply(a, 3)
d = tf.div(a, 6)
with tf.Session() as sess:
    print(sess.run(a))  # 6
    print(sess.run(b))  # 2
    print(sess.run(c))  # 18
    print(sess.run(d))  # 1
TensorFlow Arithmetic Methods
import tensorflow as tf
PI = 3.141592
sess = tf.Session()
print(sess.run(tf.div(tf.sin(PI/4.), tf.cos(PI/4.))))  # sin/cos = tan(PI/4), approximately 1.0
TensorFlow Arithmetic Methods
Output from
TensorFlow: placeholders example
import tensorflow as tf  # TensorFlow 1.x API
a = tf.placeholder("float")
b = tf.placeholder("float")
c = tf.multiply(a,b)
# initialize a and b:
feed_dict = {a:2, b:3}
# multiply a and b:
with tf.Session() as sess:
    print(sess.run(c, feed_dict))  # 6.0
TensorFlow fetch/feed_dict
import tensorflow as tf  # TensorFlow 1.x API
# y = W*x + b: W and x are 1d arrays
W = tf.constant([10,20], name='W')
x = tf.placeholder(tf.int32, name='x')
b = tf.placeholder(tf.int32, name='b')
Wx = tf.multiply(W, x, name='Wx')
y = tf.add(Wx, b, name='y')
TensorFlow fetch/feed_dict
with tf.Session() as sess:
    print("Result 1: Wx = ", sess.run(Wx, feed_dict={x:[5,10]}))
    print("Result 2: y = ", sess.run(y, feed_dict={x:[5,10], b:[15,25]}))
Result 1: Wx = [50 200]
Result 2: y = [65 225]
TensorFlow Arithmetic Expressions
import tensorflow as tf  # TensorFlow 1.x API
x = tf.constant(5,name="x")
y = tf.constant(8,name="y")
z = tf.Variable(2*x+3*y, name="z")
model = tf.global_variables_initializer()
with tf.Session() as session:
    writer = tf.summary.FileWriter("./tf_logs",session.graph)
    session.run(model)  # initialize the variable z
    print('z = ', session.run(z))  # => z = 34
# tensorboard --logdir=./tf_logs
TensorFlow Eager Execution
An imperative interface to TF (experimental)
Fast debugging & immediate run-time errors
Eager execution is not included in v1.4 of TF
build TF from source or install the nightly build
pip install tf-nightly # CPU
pip install tf-nightly-gpu #GPU
TensorFlow Eager Execution
integration with Python tools
Supports dynamic models + Python control flow
support for custom and higher-order gradients
Supports most TensorFlow operations
TensorFlow Eager Execution
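import tensorflow as tf
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()  # must be called once at program startup
x = [[2.]]
m = tf.matmul(x, x)
print(m)  # the result (4.0) is available immediately, no Session needed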
Android and Deep Learning
TensorFlow Lite (announced 2017 Google I/O)
A subset of the TensorFlow APIs (which ones?)
Provides “regular” TensorFlow APIs for apps
Does not require Python scripts (?)
Deep Learning and Art
“Convolutional Blending” images:
=> 19-layer Convolutional Neural Network
Prisma: Android app with CNN
What Do I Learn Next?
PGMs (Probabilistic Graphical Models)
MC (Markov Chains)
MCMC (Markov Chain Monte Carlo)
HMMs (Hidden Markov Models)
RL (Reinforcement Learning)
Hopfield Nets
Neural Turing Machines
Autoencoders
Hypernetworks
Pixel Recurrent Neural Networks
Bayesian Neural Networks
SVMs
About Me: Recent Books
1) HTML5 Canvas and CSS3 Graphics (2013)
2) jQuery, CSS3, and HTML5 for Mobile (2013)
3) HTML5 Pocket Primer (2013)
4) jQuery Pocket Primer (2013)
5) HTML5 Mobile Pocket Primer (2014)
6) D3 Pocket Primer (2015)
7) Python Pocket Primer (2015)
8) SVG Pocket Primer (2016)
9) CSS3 Pocket Primer (2016)
10) Android Pocket Primer (2017)
11) Angular Pocket Primer (2017)
12) Data Cleaning Pocket Primer (2018)
13) RegEx Pocket Primer (2018)
About Me: Training
=> Deep Learning, Keras, and TensorFlow
=> Mobile and TensorFlow Lite
=> R and Deep Learning (Keras and TensorFlow)
=> Android for Beginners

Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
Edge AI and Vision Alliance
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors

Recently uploaded (20)

June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors

Java and Deep Learning (Introduction)

  • 1. Java and Deep Learning (Introduction) Java Meetup SF Pivotal Labs January 10, 2018 Oswald Campesato
  • 2. Session Overview partial intro/overview of AI/ML/DL hyper-parameters a simple neural network linear regression cost/activation functions gradient descent/back propagation CNNs and RNNs Java code (CNN and TensorFlow)
  • 4. Gartner 2017: Deep Learning (YES!)
  • 5. Neural Network with 3 Hidden Layers
  • 6. The Official Start of AI (1956)
  • 7. AI/ML/DL: How They Differ Traditional AI (20th century): based on collections of rules Led to expert systems in the 1980s The era of LISP and Prolog
  • 8. AI/ML/DL: How They Differ Machine Learning: Started in the 1950s (approximate) Alan Turing and “learning machines” Data-driven (not rule-based) Many types of algorithms Involves optimization
  • 9. AI/ML/DL: How They Differ Deep Learning: Started in the 1950s (approximate) The “perceptron” (basis of NNs) Data-driven (not rule-based) large (even massive) data sets Involves neural networks (CNNs: ~1970s) Lots of heuristics Heavily based on empirical results
  • 10. The Rise of Deep Learning Massive and inexpensive computing power Huge volumes of data/Powerful algorithms The “big bang” in 2009: ”deep-learning neural networks and NVidia GPUs" Google Brain used NVidia GPUs (2009)
  • 11. AI/ML/DL: Commonality All of them involve a model A model represents a system Goal: a good predictive model The model is based on: Many rules (for AI) data and algorithms (for ML) large sets of data (for DL)
  • 12. Clustering Example #1 Given some red dots and blue dots Red dots are in the upper half plane Blue dots in the lower half plane How to detect if a point is red or blue?
  • 15. Clustering Example #2 Given some red dots and blue dots Red dots are inside a unit square Blue dots are outside the unit square How to detect if a point is red or blue?
  • 16. Clustering Example #2  Two input nodes X and Y  One hidden layer with 4 nodes (one per line)  X & Y weights are the (x,y) values of the inward pointing perpendicular vector of each side  The threshold values are the negative of the y-intercept (or the x-intercept)  The outbound weights are all equal to 1  The threshold for the output node is 4
  • 19. Clustering Exercises #1 Describe an NN for a triangle Describe an NN for a pentagon Describe an NN for an n-gon (convex) Describe an NN for an n-gon (non-convex)
  • 20. Clustering Exercises #2 Create an NN for an OR gate Create an NN for a NOR gate Create an NN for an AND gate Create an NN for a NAND gate Create an NN for an XOR gate => requires a hidden layer (XOR is not linearly separable)
  • 21. Clustering Exercises #3 Convert example #2 to a 3D cube
  • 22. Clustering Example #2 A few points to keep in mind: A “step” activation function (0 or 1) No back propagation No cost function => no learning involved
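The unit-square network from Clustering Example #2 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the square is taken to be [0, 1] x [0, 1], points on the boundary count as inside, and the names `W_hidden`, `b_hidden`, and `is_red` are hypothetical. The offsets are written as biases added to the weighted sum rather than as thresholds, which is an equivalent convention.

```python
import numpy as np

def step(z):
    """Step activation: 1 where z >= 0, otherwise 0."""
    return (np.asarray(z) >= 0).astype(int)

# One hidden node per side of the (assumed) unit square [0, 1] x [0, 1].
# Each row is the inward-pointing normal vector of one side.
W_hidden = np.array([
    [ 1.0,  0.0],   # left side   (x = 0): fires when x >= 0
    [-1.0,  0.0],   # right side  (x = 1): fires when x <= 1
    [ 0.0,  1.0],   # bottom side (y = 0): fires when y >= 0
    [ 0.0, -1.0],   # top side    (y = 1): fires when y <= 1
])
b_hidden = np.array([0.0, 1.0, 0.0, 1.0])  # offset of each side from the origin

def is_red(point):
    """Return 1 if the point lies inside the square (red), else 0 (blue)."""
    hidden = step(W_hidden @ np.asarray(point, dtype=float) + b_hidden)
    # Outbound weights are all 1, so the output node counts the hidden
    # nodes that fired and compares the count with its threshold of 4.
    return int(hidden.sum() >= 4)
```

Nothing is learned here: the weights are written down by hand from the geometry, which is why no cost function or back propagation is involved.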
  • 23. A 2D Linear Regression Model Perform the following steps: 1) Start with a simple model (2 variables) 2) Generalize that model (n variables) 3) See how it might apply to a NN
  • 24. Linear Regression Details One of the simplest models in ML Fits a line (y = m*x + b) to data in 2D Finds best line by minimizing MSE: m has a closed form solution: m = cov(x, y) / var(x) b also has a closed form solution: b = mean(y) - m*mean(x)
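A minimal sketch of fitting the line y = m*x + b by minimizing MSE, using the standard least-squares closed form. The function name `fit_line` and the sample data are hypothetical and only serve the illustration.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = m*x + b; returns (m, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    y_centered = y - y.mean()
    # Slope: covariance of x and y divided by the variance of x
    m = np.sum(x_centered * y_centered) / np.sum(x_centered ** 2)
    # Intercept: the fitted line passes through the point of means
    b = y.mean() - m * x.mean()
    return m, b

# Hypothetical data lying exactly on y = 2*x + 1
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

With noisy data the same two formulas give the line with the smallest mean squared error.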
  • 25. Linear Regression in 2D: graph
  • 27. Linear Regression: example #1 One feature (independent variable): X = number of square feet Predicted value (dependent variable): Y = cost of a house A very “coarse grained” model We can devise a much better model
  • 28. Linear Regression: example #2 Multiple features: X1 = # of square feet X2 = # of bedrooms X3 = # of bathrooms (dependency?) X4 = age of house X5 = cost of nearby houses X6 = corner lot (or not): Boolean a much better model (6 features)
  • 29. Linear Multivariate Analysis General form of multivariate equation: Y = w1*x1 + w2*x2 + . . . + wn*xn + b w1, w2, . . . , wn are numeric values x1, x2, . . . , xn are variables (features) Properties of variables: Can be independent (Naïve Bayes) weak/strong dependencies can exist
  • 30. Neural Network with 3 Hidden Layers
  • 31. Neural Networks: equations Node “values” in first hidden layer: N1 = w11*x1+w21*x2+…+wn1*xn N2 = w12*x1+w22*x2+…+wn2*xn N3 = w13*x1+w23*x2+…+wn3*xn . . . Nn = w1n*x1+w2n*x2+…+wnn*xn Similar equations for other pairs of layers
  • 32. Neural Networks: Matrices From inputs to first hidden layer: Y1 = W1*X + B1 (X/Y1/B1: vectors; W1: matrix) From first to second hidden layers: Y2 = W2*Y1 + B2 (Y1/Y2/B2: vectors; W2: matrix) From second to third hidden layers: Y3 = W3*Y2 + B3 (Y2/Y3/B3: vectors; W3: matrix)  Apply an “activation function” to y values
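The matrix form of a forward pass can be sketched as a loop over (W, B) pairs. This is an illustrative sketch, not code from the deck: the layer sizes, the choice of ReLU, and the names `forward` and `layers` are assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def forward(x, layers):
    """Feed x through each (W, B) pair, applying the activation to every layer's output."""
    y = x
    for W, B in layers:
        y = relu(W @ y + B)   # output of one layer is the input of the next
    return y

rng = np.random.default_rng(0)
sizes = [4, 5, 5, 5]   # assumed: 4 inputs followed by three hidden layers of 5 nodes
layers = [
    (rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]
x = rng.normal(size=4)
out = forward(x, layers)   # vector of 5 values from the third hidden layer
```

Each W has one row per node in the receiving layer, so `W @ y` computes all of that layer's weighted sums at once.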
  • 33. Neural Networks (general) Multiple hidden layers: Layer composition is your decision Activation functions: sigmoid, tanh, RELU Back propagation (1980s) => Initial weights: small random numbers
  • 39. Activation Functions in Python import numpy as np ... # Python sigmoid example: z = 1/(1 + np.exp(-np.dot(W, x))) ... # Python tanh example: z = np.tanh(np.dot(W, x)) # Python ReLU example: z = np.maximum(0, np.dot(W, x))
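A runnable version of the three activation functions, applied to a small weighted sum. The values of `W` and `x` are made up for the illustration.

```python
import numpy as np

def sigmoid(z):
    """Squashes values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes values into the range (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Keeps positive values and replaces negative values with 0."""
    return np.maximum(0, z)

# Hypothetical weights and inputs
W = np.array([[1.0, -2.0],
              [0.5,  0.5]])
x = np.array([2.0, 1.0])

z = np.dot(W, x)          # weighted sums: [0.0, 1.5]
s = sigmoid(z)
t = tanh(z)
r = relu(z)
```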
  • 40. What’s the “Best” Activation Function? Initially: sigmoid was popular Then: tanh became popular Now: RELU is preferred (better results) Softmax: for FC (fully connected) layers NB: sigmoid and tanh are used in LSTMs
  • 41. Even More Activation Functions!  8/comprehensive-list-of-activation-functions-in- neural-networks-with-pros-cons  science/activation-functions-and-its-types-which- is-better-a9a5310cc8f  layer-neural-networks-with-sigmoid-function- deep-learning-for-rookies-2-bf464f09eb7f
  • 45. How to Select a Cost Function 1) Depends on the learning type: => supervised/unsupervised/RL 2) Depends on the activation function 3) Other factors Example: cross-entropy cost function for supervised learning on multiclass classification
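As a sketch of the cross-entropy example, the following computes the softmax of raw scores and then the average cross-entropy against integer class labels. The function names and sample values are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the correct class."""
    probs = softmax(logits)
    rows = np.arange(logits.shape[0])
    return -np.mean(np.log(probs[rows, labels]))

# Two samples, four classes, all scores equal: each class gets probability 1/4
logits = np.zeros((2, 4))
labels = np.array([0, 3])
loss = cross_entropy(logits, labels)   # equals log(4) for uniform predictions
```

The cost drops toward 0 as the probability assigned to the correct class approaches 1.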
  • 46. GD versus SGD SGD (Stochastic Gradient Descent): + involves a SUBSET of the dataset + aka Minibatch Stochastic Gradient Descent GD (Gradient Descent): + involves the ENTIRE dataset More details:
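The difference between GD and minibatch SGD can be sketched on a small linear model: GD computes each gradient from the entire dataset, while minibatch SGD updates the weights after each small random subset. The data, learning rate, batch size, and function names below are assumptions chosen for the illustration.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of the MSE cost for the linear model X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def gd(X, y, lr=0.1, epochs=200):
    """Gradient descent: one update per pass, using the ENTIRE dataset."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * gradient(w, X, y)
    return w

def minibatch_sgd(X, y, lr=0.1, epochs=200, batch_size=8, seed=0):
    """Minibatch SGD: many updates per pass, each using a SUBSET of the dataset."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(y))          # reshuffle every epoch
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            w -= lr * gradient(w, X[idx], y[idx])
    return w

# Hypothetical noise-free data generated from y = 3*x - 1
rng = np.random.default_rng(42)
X = np.column_stack([rng.uniform(-1, 1, 64), np.ones(64)])  # feature plus bias column
y = X @ np.array([3.0, -1.0])

w_gd = gd(X, y)
w_sgd = minibatch_sgd(X, y)
```

Both runs recover weights close to (3, -1); the minibatch version gets there with cheaper individual updates, which matters when the dataset is large.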
  • 47. Setting up Data & the Model Normalize the data: Subtract the ‘mean’ and divide by stddev [Central Limit Theorem] Initial weight values for NNs: Random numbers in N(0,1) More details:
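A short sketch of both set-up steps: normalizing each feature column to mean 0 and standard deviation 1, and drawing initial weights from N(0,1). The array shapes and names are hypothetical.

```python
import numpy as np

def standardize(X):
    """Subtract each column's mean and divide by its standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / std

rng = np.random.default_rng(0)

# Hypothetical raw data: 50 samples with 3 features on a 0..100 scale
X = rng.uniform(0, 100, size=(50, 3))
X_norm = standardize(X)

# Initial weights for a layer with 3 inputs and 4 nodes, drawn from N(0, 1)
W = rng.normal(loc=0.0, scale=1.0, size=(4, 3))
```

In practice the mean and standard deviation are computed on the training set only and then reused for validation and test data.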
  • 48. What are Hyper Parameters? higher level concepts about the model such as complexity, or capacity to learn Cannot be learned directly from the data in the standard model training process must be predefined
  • 49. Hyper Parameters (examples) # of hidden layers in a neural network the learning rate (in many models) the dropout rate # of leaves or depth of a tree # of latent factors in a matrix factorization # of clusters in a k-means clustering
  • 50. Hyper Parameter: dropout rate "dropout" refers to dropping out units (both hidden and visible) in a neural network a regularization technique for reducing overfitting in neural networks prevents complex co-adaptations on training data a very efficient way of performing model averaging with neural networks
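Dropout can be sketched as a random mask over a layer's activations. The version below uses the common "inverted dropout" scaling (surviving units are divided by the keep probability), which is an assumption on top of the slide; the function name and sample sizes are hypothetical.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero out each unit with probability `rate` (training time only)."""
    if rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob   # True where the unit survives
    # Scale survivors so the expected value of each unit is unchanged
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((100, 100))
dropped = dropout(activations, rate=0.5, rng=rng)   # entries are either 0.0 or 2.0
```

At prediction time dropout is switched off, and with this scaling no further correction of the activations is needed.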
  • 51. How Many Layers in a DNN? Algorithm #1 (from Geoffrey Hinton): 1) add layers until you start overfitting your training set 2) now add dropout or some other regularization method Algorithm #2 (Yoshua Bengio): "Add layers until the test error does not improve anymore."
  • 52. How Many Hidden Nodes in a DNN? Based on a relationship between: # of input and # of output nodes Amount of training data available Complexity of the cost function The training algorithm
  • 53. CNNs versus RNNs CNNs (Convolutional NNs): Good for image processing 2000: CNNs processed 10-20% of all checks => Approximately 60% of all NNs RNNs (Recurrent NNs): Good for NLP and audio
  • 56. CNNs: Convolution Matrices (examples) Sharpen: Blur:
  • 57. CNNs: Convolution Matrices (examples) Edge detect: Emboss:
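The kernel values on these two slides were images and are not in the transcript, so the matrices below are commonly used 3x3 examples and should be read as assumptions. The sketch slides each kernel over a grayscale image without padding, the way a CNN convolution layer does (no kernel flip).

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Commonly used kernels (assumed, not taken from the slides)
SHARPEN = np.array([[ 0, -1,  0], [-1,  5, -1], [ 0, -1,  0]], dtype=float)
BLUR    = np.ones((3, 3)) / 9.0
EDGE    = np.array([[-1, -1, -1], [-1,  8, -1], [-1, -1, -1]], dtype=float)
EMBOSS  = np.array([[-2, -1,  0], [-1,  1,  1], [ 0,  1,  2]], dtype=float)

flat = np.full((5, 5), 7.0)            # an image with no detail at all
edges_of_flat = convolve2d(flat, EDGE)  # all zeros: there are no edges to detect
```

A kernel whose entries sum to 1 (sharpen, blur, emboss) leaves flat regions unchanged, while one whose entries sum to 0 (edge detect) maps flat regions to 0.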
  • 59. CNNs: Max Pooling Example
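Max pooling with a 2x2 window and stride 2 can be sketched with a reshape. The feature map values are made up for the illustration.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]   # drop an odd trailing row/column
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
])
pooled = max_pool_2x2(feature_map)   # [[6, 8], [3, 4]]
```

The output has half the height and width of the input, which is how pooling reduces the amount of data passed to later layers.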
  • 60. CNNs: convolution and pooling (2)
  • 61. Sample CNN in Keras (fragment)  from keras.models import Sequential  from keras.layers.core import Dense, Dropout, Flatten, Activation  from keras.layers.convolutional import Conv2D, MaxPooling2D  from keras.optimizers import Adadelta  input_shape = (3, 32, 32)  nb_classes = 10  model = Sequential()  model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape))  model.add(Activation('relu'))  model.add(Conv2D(32, (3, 3)))  model.add(Activation('relu'))  model.add(MaxPooling2D(pool_size=(2, 2)))  model.add(Dropout(0.25))
  • 63. GANs: Generative Adversarial Networks Make imperceptible changes to images Can consistently defeat all NNs Can have extremely high error rate Some images create optical illusions  of-using-generative-adversarial-networks-a-type-of- neural-network
  • 64. GANs: Generative Adversarial Networks Create your own GANs: beginners GANs from MNIST:
  • 65. GANs: Generative Adversarial Networks GANs, Graffiti, and Art: machine-learning-models/ GANs and audio: everything-it-hears Houdini algorithm:
  • 66. Deep Learning Playground TF playground home page: Demo #1: playground Converts playground to TypeScript
  • 67. Java and DL/ML Frameworks Deeplearning4j: Pure Java framework for DL SMILE: “Statistical Machine Intelligence and Learning Engine” "outperforms R, Python, Spark, H2O significantly” Weka (“WAY-kuh”): IBM neuroph:
  • 68. Deeplearning4j Library Open source, distributed library for the JVM Written in Java and Scala (GPU support) Integrates with Hadoop and Spark 
  • 69. Deeplearning4j Library Basic set-up steps (command line): mkdir dl4j-examples cd dl4j-examples git clone examples
  • 70. Deeplearning4j Library Set-up steps for IntelliJ  File > Import Project (or New Project from Existing Sources)  Select the directory with the DL4J examples.  Select Maven build tool in the next window  Check the following two boxes:  1) "Search for projects recursively"  2) "Import Maven projects automatically” (Next)  click on "+" sign (bottom of window) to JDK/SDK  Click through until you reach "Finish"
  • 71. Smile Framework Support for many algorithms classification, regression, clustering association rule mining, feature selection manifold learning, multidimensional scaling genetic algorithm, missing value imputation efficient nearest neighbor search
  • 72. Smile Framework Natural Language Processing: Tokenizers, stemming, phrase detection part-of-speech tagging, keyword extraction named entity recognition, sentiment analysis relevance ranking, taxonomy
  • 73. Smile Framework Mathematics and Statistics linear algebra (LU decomposition) Cholesky decomposition, QR decomposition eigenvalue decomposition singular value decomposition band matrix, and sparse matrix tests: t-test, F-test, chi-square test correlation test (Pearson, Spearman, Kendall) Kolmogorov-Smirnov test distributions/random number generators interpolation, sorting, wavelet, plot
  • 74. Smile Framework Random Forest (SMILE Scala API): val data = read.arff("iris.arff", 4) val (x, y) = data.unzipInt val rf = randomForest(x, y) println(s"OOB error = ${rf.error}") rf.predict(x(0))
  • 75. What is TensorFlow? An open source framework for ML and DL A “computation” graph Created by Google (released 11/2015) Evolved from Google Brain Linux and Mac OS X support (VM for Windows) TF home page:
  • 76. What is TensorFlow? Support for Python, Java, C++ TPUs available for faster processing Can be embedded in Python scripts Installation: pip install tensorflow TensorFlow cluster:
  • 77. What is a Tensor? TF tensors are n-dimensional arrays TF tensors are very similar to numpy ndarrays scalar number: a zeroth-order tensor vector: a first-order tensor matrix: a second-order tensor 3-dimensional array: a 3rd order tensor  examples
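Since TF tensors are very similar to NumPy ndarrays, the order of a tensor can be illustrated with NumPy alone: it is the number of dimensions, `ndim`. The sample values are hypothetical.

```python
import numpy as np

scalar = np.array(3.0)                 # zeroth-order tensor
vector = np.array([1.0, 2.0, 3.0])     # first-order tensor
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])        # second-order tensor
cube   = np.zeros((2, 3, 4))           # third-order tensor (3-dimensional array)

orders = [t.ndim for t in (scalar, vector, matrix, cube)]
shapes = [t.shape for t in (scalar, vector, matrix, cube)]
```

The shape lists the size along each dimension, so a scalar has the empty shape `()`, the same `shape=()` that TensorFlow reports for a constant such as `tf.constant(3.0)`.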
  • 78. TensorFlow: constants (immutable)  import tensorflow as tf #  aconst = tf.constant(3.0)  print(aconst) # output: Tensor("Const:0", shape=(), dtype=float32)  sess = tf.Session()  print(sess.run(aconst)) # output: 3.0  sess.close()  # => there's a better way…
  • 79. TensorFlow: constants import tensorflow as tf # aconst = tf.constant(3.0) print(aconst) Automatically close “sess” with tf.Session() as sess:  print(
  • 80. TensorFlow Arithmetic import tensorflow as tf # a = tf.add(4, 2) b = tf.subtract(8, 6) c = tf.multiply(a, 3) d = tf.div(a, 6) with tf.Session() as sess: print(sess.run(a)) # 6 print(sess.run(b)) # 2 print(sess.run(c)) # 18 print(sess.run(d)) # 1
  • 81. TensorFlow Arithmetic Methods import tensorflow as tf PI = 3.141592 sess = tf.Session() print(,8))) print(,8.0))) print( print( print(, tf.cos(PI/4.))))
  • 82. TensorFlow Arithmetic Methods Output from 1 2.0 6.27833e-07 -1.0 1.0
  • 83. TensorFlow: placeholders example import tensorflow as tf # a = tf.placeholder("float") b = tf.placeholder("float") c = tf.multiply(a,b) # initialize a and b: feed_dict = {a:2, b:3} # multiply a and b: with tf.Session() as sess: print(sess.run(c, feed_dict))
  • 84. TensorFlow fetch/feed_dict  import tensorflow as tf #  # y = W*x + b: W and x are 1d arrays  W = tf.constant([10,20], name='W')  x = tf.placeholder(tf.int32, name='x')  b = tf.placeholder(tf.int32, name='b')  Wx = tf.multiply(W, x, name='Wx')  y = tf.add(Wx, b, name='y')
  • 85. TensorFlow fetch/feed_dict with tf.Session() as sess: print("Result 1: Wx = ", sess.run(Wx, feed_dict={x:[5,10]})) print("Result 2: y = ", sess.run(y, feed_dict={x:[5,10], b:[15,25]})) Result 1: Wx = [50 200] Result 2: y = [65 225]
  • 86. TensorFlow Arithmetic Expressions import tensorflow as tf # x = tf.constant(5,name="x") y = tf.constant(8,name="y") z = tf.Variable(2*x+3*y, name="z") model = tf.global_variables_initializer() with tf.Session() as session: writer = tf.summary.FileWriter("./tf_logs",session.graph) session.run(model) print('z = ', session.run(z)) # => z = 34 # tensorboard --logdir=./tf_logs
  • 87. TensorFlow Eager Execution An imperative interface to TF (experimental) Fast debugging & immediate run-time errors Eager execution is not included in v1.4 of TF build TF from source or install the nightly build pip install tf-nightly # CPU pip install tf-nightly-gpu #GPU
  • 88. TensorFlow Eager Execution integration with Python tools Supports dynamic models + Python control flow support for custom and higher-order gradients Supports most TensorFlow operations  execution-imperative-define-by.html
  • 89. TensorFlow Eager Execution import tensorflow as tf # import tensorflow.contrib.eager as tfe tfe.enable_eager_execution() x = [[2.]] m = tf.matmul(x, x) print(m)
  • 90. Android and Deep Learning TensorFlow Lite (announced 2017 Google I/O) A subset of the TensorFlow APIs (which ones?) Provides “regular” TensorFlow APIs for apps Does not require Python scripts (?)
  • 91. Deep Learning and Art “Convolutional Blending” images: => 19-layer Convolutional Neural Network Prisma: Android app with CNN  engineer-taught-an-algorithm-to-make-train-footage- and-its-hypnotic
  • 92. What Do I Learn Next?  PGMs (Probabilistic Graphical Models)  MC (Markov Chains)  MCMC (Markov Chain Monte Carlo)  HMMs (Hidden Markov Models)  RL (Reinforcement Learning)  Hopfield Nets  Neural Turing Machines  Autoencoders  Hypernetworks  Pixel Recurrent Neural Networks  Bayesian Neural Networks  SVMs
  • 93. About Me: Recent Books 1) HTML5 Canvas and CSS3 Graphics (2013) 2) jQuery, CSS3, and HTML5 for Mobile (2013) 3) HTML5 Pocket Primer (2013) 4) jQuery Pocket Primer (2013) 5) HTML5 Mobile Pocket Primer (2014) 6) D3 Pocket Primer (2015) 7) Python Pocket Primer (2015) 8) SVG Pocket Primer (2016) 9) CSS3 Pocket Primer (2016) 10) Android Pocket Primer (2017) 11) Angular Pocket Primer (2017) 12) Data Cleaning Pocket Primer (2018) 13) RegEx Pocket Primer (2018)
  • 94. About Me: Training => Deep Learning, Keras, and TensorFlow => Mobile and TensorFlow Lite => R and Deep Learning (Keras and TensorFlow) => Android for Beginners