Many state-of-the-art machine learning applications today are based on artificial neural networks. In this talk we explore several commonly used neural network architectures. We identify the ideas behind their design, describe their topologies, outline their properties, and discuss their uses.
You might enjoy this talk if you are interested in:
* Discovering some of the popular neural network types
* Learning about their design and how they work
* Understanding what they are good for
8. Disadvantages
● Large amount of training data
● Long time to train
● Computationally expensive
● Hard to interpret – black box
● Many possible architectures
10. Perceptron
● Simplified model of a neuron (1957)
● Linear binary classifier
● Multiple numeric inputs
● One boolean output
● Linearly separable classes only
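To make this concrete, here is a minimal NumPy sketch of a perceptron: a dot product of the numeric inputs with a weight vector, plus a bias, thresholded to a boolean output. The weights and the AND example below are hand-picked for illustration:

```python
import numpy as np

def perceptron(x, w, b):
    """Linear binary classifier: fires iff the weighted sum w.x + b is positive."""
    return bool(np.dot(w, x) + b > 0)

# Hand-picked weights that make the perceptron compute logical AND;
# AND is linearly separable, so a single perceptron can represent it.
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x, dtype=float), w, b))
```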
14. Multi-layer perceptron
● Nonlinear classification or regression
● Inputs
  ○ Features
● Hidden layers
  ○ Parallel neurons feeding the next layer
  ○ Dot product
  ○ Sigmoid activation function
● Output layer
  ○ Arbitrary activation function
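A hidden layer is just a bank of parallel neurons, each taking a dot product of the previous layer's output and passing it through a sigmoid. A bare-bones forward pass might look like the sketch below; the layer sizes and random weights are illustrative only, and for brevity the sigmoid is used on the output layer as well:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Propagate the feature vector x through a list of (weights, bias) layers."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)  # dot products of parallel neurons, then sigmoid
    return a

rng = np.random.default_rng(0)
# 3 input features -> 4 hidden neurons -> 1 output neuron
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(1, 4)), np.zeros(1))]
print(forward(np.array([0.5, -1.2, 3.0]), layers))
```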
15. Training
● Calculate the output
● Apply a loss function
  ○ Must be differentiable
  ○ Should be minimized – an optimization problem
● Gradient descent to update the weights (sketch below)
  ○ Step size proportional to the learning rate
  ○ Stochastic approximations
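The update rule itself fits in one line: move every weight against the gradient of the loss, with a step proportional to the learning rate. A sketch for the simplest case, a linear model with mean squared error, where the toy data and learning rate are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))          # toy inputs
y = X @ np.array([2.0, -1.0]) + 0.5    # toy targets from a known linear rule

w, b, lr = np.zeros(2), 0.0, 0.1       # lr is the learning rate
for _ in range(200):
    err = X @ w + b - y
    grad_w = 2 * X.T @ err / len(y)    # derivative of mean squared error w.r.t. w
    grad_b = 2 * err.mean()            # ... and w.r.t. b
    w -= lr * grad_w                   # step proportional to the learning rate
    b -= lr * grad_b

print(w, b)  # converges towards [2.0, -1.0] and 0.5
```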
16. Training
● Backpropagation (1974) – sketch below
  ○ Derivative of the loss with respect to the weights
  ○ Applied to earlier layers via the chain rule
● Regularization
  ○ Reduces overfitting
  ○ L1 or L2 norm penalty
  ○ Dropout – ignore random neurons during training
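As a sketch of the chain rule at work, here is backpropagation through one hidden layer with sigmoid activations and a squared-error loss; the layer sizes and data are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
x, y = rng.normal(size=3), np.array([1.0])   # one sample, one target
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))

# Forward pass
h = sigmoid(W1 @ x)        # hidden layer
out = sigmoid(W2 @ h)      # output layer

# Backward pass: chain rule applied layer by layer
d_out = 2 * (out - y) * out * (1 - out)   # loss derivative at the output layer
grad_W2 = np.outer(d_out, h)
d_h = (W2.T @ d_out) * h * (1 - h)        # propagated back to the hidden layer
grad_W1 = np.outer(d_h, x)
print(grad_W1.shape, grad_W2.shape)       # gradients match the weight shapes
```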
20. Convolutional networks
● Convolutional layer
  ○ Filter that scans the image – convolution matrix
  ○ Receptive field – filter size
  ○ Depth – number of filters
  ○ Space invariant
● Pooling layer
  ○ Combines a cluster of neurons into one
  ○ Non-linear down-sampling
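Both layer types can be sketched in a few lines of NumPy: a filter scanned across the image (the convolution) and a pooling step that collapses each cluster of outputs into its maximum. The 8x8 image and the edge filter are made-up examples:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a filter over the image (valid padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = (image[i:i+kh, j:j+kw] * kernel).sum()
    return out

def max_pool(feature_map, size=2):
    """Non-linear down-sampling: keep the max of each size x size cluster."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    return feature_map[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.default_rng(3).random((8, 8))
edge_filter = np.array([[1.0, -1.0], [1.0, -1.0]])  # crude vertical-edge detector
print(max_pool(conv2d(image, edge_filter)).shape)   # (3, 3)
```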
21. Convolutional networks
● Fully connected layer
  ○ Dense
  ○ Just like in a multi-layer perceptron
● Activation function
  ○ Rectifier (ReLU) – linear, but removes negative values
  ○ Trains faster and reduces the vanishing gradient problem
● Output activation function
  ○ Softmax – single-label classification
  ○ Sigmoid – multi-label classification
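Both activation functions mentioned above are one-liners; a small sketch:

```python
import numpy as np

def relu(z):
    """Rectifier: linear for positive values, zero for negative ones."""
    return np.maximum(0.0, z)

def softmax(z):
    """Scores -> probabilities summing to 1; pick one class (single-label)."""
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, -1.0, 0.5])
print(relu(scores))     # [2.  0.  0.5]
print(softmax(scores))  # a probability distribution over three classes
```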
26. Recurrent networks
● Multi-layer perceptron with back-connections
● Topology is a directed graph
● Internal state – memory
● Variable-length sequences with internal dependencies
● Training
  ○ Backpropagation through time
● Vanishing gradient problem reduced via gated state
  ○ Long short-term memory (1997)
  ○ Gated recurrent unit (2014)
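The defining feature is the back-connection: the hidden state computed at one time step feeds into the next, acting as memory. Below is a minimal ungated recurrent cell with made-up sizes; LSTM and GRU cells add gates around this same loop:

```python
import numpy as np

def rnn(sequence, W_x, W_h, b):
    """Plain recurrent cell: the hidden state is the network's memory."""
    h = np.zeros(W_h.shape[0])
    for x in sequence:                       # works for any sequence length
        h = np.tanh(W_x @ x + W_h @ h + b)   # back-connection: h feeds into itself
    return h

rng = np.random.default_rng(4)
W_x, W_h, b = rng.normal(size=(5, 3)), rng.normal(size=(5, 5)), np.zeros(5)
sequence = [rng.normal(size=3) for _ in range(7)]  # 7 time steps, 3 features each
print(rnn(sequence, W_x, W_h, b))
```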
30. Materials
● Deep Learning @ MIT Press
● Neural Networks and Deep Learning @ Michael Nielsen
● Practical Deep Learning @ Coursera
● Deep Learning Specialization @ Coursera
● Deep Learning Courses @ edX