Introduction to Machine Learning and Neural Networks
Speaker: Deping Huang
09/29/2017
Outline
1. Research Background: Brief History of Artificial Intelligence and Introduction to Machine Learning
2. Models and Methods: Introduction to Neural Networks and the Backpropagation Algorithm
3. A Simple Example: Dealing with the MNIST Dataset and Digit Recognition
Research Background
Fig. AlphaGo vs. Lee Sedol
Fig. AlphaGo vs. Ke Jie
Research Background
Fig. Professor Fei-Fei Li
Fig. The IMAGENET dataset: 14,197,122 images, 21,841 synsets indexed
Research Background
Fig. Images that combine the content of a photograph with the style of several well-known artworks.
2015, Gatys et al., arXiv, A Neural Algorithm of Artistic Style
Machine Learning
Supervised learning
Unsupervised learning
Reinforcement learning
Some Machine Learning Methods
Support Vector Machines
Neural Networks
Restricted Boltzmann Machines
What is a Neural Network?
Fig. What is a neural network?
Structure of Neural Network
Fig. Model of a neural network.
b: biases
w: weights
z: weighted inputs
y: activations (also denoted by a), produced by the sigmoid function
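To make this notation concrete, here is a minimal NumPy sketch of a forward pass through one fully connected layer; the layer sizes and values are arbitrary, chosen only for illustration:

```python
import numpy as np

def sigmoid(z):
    # The sigmoid activation: squashes any real z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical layer: 3 inputs feeding 2 neurons.
rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3))  # weights, one row per neuron
b = rng.standard_normal((2, 1))  # biases, one per neuron
x = rng.standard_normal((3, 1))  # input activations

z = w @ x + b    # weighted input z = w.x + b
y = sigmoid(z)   # activation y = sigma(z), also written a
print(z.ravel(), y.ravel())
```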
Training: From Data to Parameters
Fig. Labeled training examples: images paired with labels such as "bird", "cat", and "dog", fed through the network.
Neural networks get their parameters from huge amounts of data; this process is called training.
Cost Function
Fig. The cost function.
Why the factor of 1/2? It is a normalizing factor: for one-hot targets such as "bird", "cat", and "dog", if the output y equals the target a the cost is 0, while a completely wrong one-hot output gives ||y - a||^2 = 2, which the 1/2 normalizes to 1.
Gradient Descent Method
Fig. Gradient descent method.
η: learning rate (a hyperparameter)
Fig. A 2D cost surface and the gradient descent path.
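A minimal sketch of the update rule w ← w − η ∂C/∂w on a toy quadratic cost; the cost function, starting point, and learning rate are illustrative assumptions:

```python
import numpy as np

def cost(w):
    # Toy cost: a paraboloid with its minimum at (1, -2).
    return 0.5 * ((w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2)

def grad(w):
    # Analytic gradient of the toy cost.
    return np.array([w[0] - 1.0, w[1] + 2.0])

eta = 0.1                   # learning rate, a hyperparameter
w = np.array([5.0, 5.0])    # arbitrary starting point
for step in range(100):
    w = w - eta * grad(w)   # gradient descent update
print(w, cost(w))           # converges toward (1, -2), cost -> 0
```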
Stochastic Gradient Descent Method
Fig. Stochastic gradient descent method.
We shuffle the data and split it into mini-batches of size m. Generally, m is far smaller than n, the size of the full training set. Using stochastic gradient descent, we get results much faster with only a small loss of accuracy.
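A sketch of the mini-batch bookkeeping described above (shuffle, then split into batches of size m); the data here is random placeholder data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 1000, 32                      # n training examples, batch size m << n
data = rng.standard_normal((n, 10))  # placeholder training inputs

indices = rng.permutation(n)         # shuffle the data once per epoch
for start in range(0, n, m):
    batch = data[indices[start:start + m]]  # one mini-batch of size <= m
    grad_estimate = batch.mean(axis=0)      # stand-in for the batch gradient
    # ...update the parameters with grad_estimate, as in gradient descent.
```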
Calculate Gradients
Fig. Calculating the gradient with the chain rule.
Fig. A two-layer network.
Fig. Calculating the gradients from the definition of the partial derivative.
Calculating gradients term by term with the chain rule is too complicated, and evaluating the cost separately for every parameter (numerical differentiation) is too time-consuming.
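To make the cost of the naive approach concrete, here is a finite-difference sketch: estimating ∂C/∂w_i needs a separate cost evaluation for every single parameter. The cost function below is a toy stand-in:

```python
import numpy as np

def cost(w):
    # Toy stand-in for a network's cost as a function of its parameters.
    return 0.5 * np.sum((w - 1.0) ** 2)

def numerical_gradient(w, eps=1e-6):
    # One extra cost evaluation per parameter: with millions of
    # parameters this means millions of forward passes per update.
    grad = np.zeros_like(w)
    for i in range(w.size):
        w_plus = w.copy()
        w_plus[i] += eps
        grad[i] = (cost(w_plus) - cost(w)) / eps
    return grad

w = np.zeros(5)
print(numerical_gradient(w))  # approximately [-1, -1, -1, -1, -1]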
Backpropagation Algorithm
Fig. The four formulas of the BP algorithm.
δ: the error of each layer
⊙: the Hadamard (element-wise) product
2015, Michael Nielsen, Neural Networks and Deep Learning
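The figure is not reproduced here, but since the slide cites Nielsen's book, the four formulas are presumably his BP1–BP4. Here L is the last layer, σ the activation function, and ⊙ the Hadamard product, (s ⊙ t)_j = s_j t_j:

```latex
\begin{aligned}
\delta^L &= \nabla_a C \odot \sigma'(z^L) && \text{(BP1)} \\
\delta^l &= \left((w^{l+1})^T \delta^{l+1}\right) \odot \sigma'(z^l) && \text{(BP2)} \\
\frac{\partial C}{\partial b^l_j} &= \delta^l_j && \text{(BP3)} \\
\frac{\partial C}{\partial w^l_{jk}} &= a^{l-1}_k \, \delta^l_j && \text{(BP4)}
\end{aligned}
```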
Backpropagation Algorithm
Fig. Proof of the backpropagation formulas (BP1–BP4).
The proof is an application of the chain rule from calculus.
Algorithm Flowchart
Fig. Algorithm flowchart for training a neural network.
1. Design the network topology: how many layers, and how many neurons in each layer?
2. Provide the data, both inputs and outputs.
3. Update the parameters with the SGD algorithm (see the training-loop sketch below).
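Putting the flowchart together, here is a minimal sketch of the whole training loop for a one-hidden-layer sigmoid network; the layer sizes, data, and learning rate are all illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# 1. Topology: 4 inputs -> 8 hidden neurons -> 3 outputs (illustrative).
w1, b1 = rng.standard_normal((8, 4)) * 0.5, np.zeros((8, 1))
w2, b2 = rng.standard_normal((3, 8)) * 0.5, np.zeros((3, 1))

# 2. Data: random placeholder inputs with one-hot targets.
x = rng.standard_normal((4, 100))
y = np.eye(3)[:, rng.integers(0, 3, 100)]

eta, m = 0.5, 10                 # learning rate, mini-batch size
for epoch in range(50):
    order = rng.permutation(100) # shuffle each epoch (SGD)
    for s in range(0, 100, m):
        xb, yb = x[:, order[s:s + m]], y[:, order[s:s + m]]
        # Forward pass.
        a1 = sigmoid(w1 @ xb + b1)
        a2 = sigmoid(w2 @ a1 + b2)
        # Backward pass (BP1-BP2 with the quadratic cost; sigma' = a(1-a)).
        d2 = (a2 - yb) * a2 * (1 - a2)
        d1 = (w2.T @ d2) * a1 * (1 - a1)
        # 3. SGD parameter update (BP3-BP4), averaged over the mini-batch.
        w2 -= eta * (d2 @ a1.T) / m
        b2 -= eta * d2.mean(axis=1, keepdims=True)
        w1 -= eta * (d1 @ xb.T) / m
        b1 -= eta * d1.mean(axis=1, keepdims=True)
```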
Introduction to Convolutional Neural Networks
Fig. Structure of a convolutional neural network.
Local receptive fields
Fig. Output of a neuron in the convolutional layer.
Shared weights and feature maps
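A sketch of what one feature map computes: every output neuron applies the same shared weights to its own local receptive field. The image size, kernel size, and values are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
image = rng.standard_normal((28, 28))  # e.g. one MNIST-sized input
w = rng.standard_normal((5, 5))        # shared weights: one 5x5 kernel
b = 0.1                                # shared bias

# One feature map: slide the same kernel over every 5x5 receptive field.
fmap = np.empty((24, 24))
for i in range(24):
    for j in range(24):
        patch = image[i:i + 5, j:j + 5]            # local receptive field
        fmap[i, j] = sigmoid(np.sum(w * patch) + b)
```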
Max-pooling
Fig. Max-pooling layer.
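And a matching sketch of 2x2 max-pooling, which keeps only the strongest activation in each non-overlapping 2x2 block of a feature map (a small toy array is used so the result can be checked by hand):

```python
import numpy as np

def max_pool_2x2(fmap):
    # Downsample by taking the maximum over each non-overlapping 2x2 block.
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

pooled = max_pool_2x2(np.arange(16.0).reshape(4, 4))
print(pooled)  # [[ 5.  7.] [13. 15.]]
```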
Application of CNN
Fig. Structure of AlexNet.
Dataset: 1.2 million training images of size 224x224; 150,000 images used for testing.
Over 60 million parameters; top-5 accuracy: 84.7%.
2012, Alex Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks
Digit Recognition: The MNIST Dataset
Fig. The MNIST dataset.
MNIST: Modified National Institute of Standards and Technology database
Fig. Network structure.
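The slides do not spell out the exact topology, but the standard MNIST setup (the one used in Nielsen's book, cited earlier) is 784 input pixels, one hidden layer, and 10 output classes. A shape-only sketch, with the hidden-layer size left as a free choice:

```python
import numpy as np

def make_network(hidden, rng=np.random.default_rng(0)):
    # 28x28 = 784 input pixels -> `hidden` sigmoid neurons -> 10 digit classes.
    sizes = [784, hidden, 10]
    weights = [rng.standard_normal((m, n)) / np.sqrt(n)
               for n, m in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros((m, 1)) for m in sizes[1:]]
    return weights, biases

# The next slide compares hidden-layer sizes of 5, 30, and 60.
for h in (5, 30, 60):
    w, b = make_network(h)
    print(h, [wi.shape for wi in w])
```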
Digit Recognition: The MNIST Dataset
Fig. Accuracy curves for hidden-layer sizes of 5, 30, and 60.
Fig. Cost-function curves for hidden-layer sizes of 5, 30, and 60.
Conclusions
1. We introduced the training method for neural networks: the BP algorithm.
2. We introduced convolutional neural networks.
3. We showed some simple results on digit recognition.
Then...
What Can We Do With Neural Networks?
THANKS
