Lesson 6
Deep Learning
Legal Notices and Disclaimers
This presentation is for informational purposes only. INTEL MAKES NO
WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Intel technologies’ features and benefits depend on system configuration and may
require enabled hardware, software or service activation. Performance varies
depending on system configuration. Check with your system manufacturer or retailer
or learn more at intel.com.
This sample source code is released under the Intel Sample Source Code License
Agreement.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other
countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2018, Intel Corporation. All rights reserved.
3
Learning Objectives
• Identify the types of problems Deep Learning resolves
• Describe the steps in building a neural network model
• Describe a convolutional neural network
• Explain transfer learning and why it’s useful
• Identify common Deep Learning architectures
You will be able to:
5
Deep Learning
“Machine learning that involves using
very complicated models called “deep
neural networks”." (Intel)
Models determine best representation
of original data; in classic machine
learning, humans must do this.
6
Classic
Machine
Learning
Step 1: Determine
features.
Step 2: Feed them
through model.
“Arjun"
Neural Network
“Arjun"
Deep
Learning
Steps 1 and 2
are combined
into 1 step.
Deep Learning Differences
Machine
Learning
Classifier
Algorithm
Feature
Detection
7
Deep Learning Problem Types
Deep Learning can solve multiple supervised and unsupervised problems.
• The majority of its success has been when working
with images, natural language, and audio data.
• Image classification and detection.
• Semantic segmentation.
• Natural language object retrieval.
• Speech recognition and language translation.
8
Classification and Detection
Detect and label the image
• Person
• Motor Bike
https://people.eecs.berkeley.edu/~jhoffman/talks/lsda-baylearn2014.pdf
9
Semantic Segmentation
Label every pixel
https://people.eecs.berkeley.edu/~jhoffman/talks/lsda-baylearn2014.pdf
10
Natural Language Object Retrieval
http://arxiv.org/pdf/1511.04164v3.pdf
11
Speech Recognition and Language Translation
http://svail.github.io/mandarin/
The same architecture can be used for speech recognition in English, or
in Mandarin Chinese.
Formulating Supervised Learning Tools
13
For a supervised learning problem:
• Collect a labeled dataset (features and target labels).
• Choose the model.
• Choose an evaluation metric:
“What to use to measure performance.”
• Choose an optimization method:1
“How to find the model configuration that gives the best performance.”
1 There are standard methods to use for different models and metrics.
Which Model?
14
There are many models that represent the problem and make decisions in different
ways, each with their own advantages and disadvantages.
• DL models are biologically inspired.
• The main building block is a neuron.
Neuron
Neuron
15
A neuron multiplies each feature by a weight and then adds these
values together.
• Z = X1W1+ X2W2+ X3W3
Z
X1
X2
X3
W2
W1
W3
Neuron
16
This value is then put through a function called the activation function.
• There are several activation functions
that can be used.
• The output of the neuron is the output
of the activation function.
Activation
Function
X1
X2
X3
W2
W1
W3
Output
Perceptron
17
A neuron with a simple activation function can solve linearly
separable problems.
• These are problems where the different
classes can be separated by a line.
• The Perceptron: one of the earliest neural
network models that used neurons with
simple activation functions.
Perceptron
18
Problems where the labels cannot be separated by a single line are not
solvable by a single neuron.
• This is a major limitation, and one of the
reasons for the first AI winter, that was
discussed in lesson 1.
19
Fully Connected Network
http://svail.github.io/mandarin/
More complicated problems can be solved by connecting multiple
neurons together and using more complicated activation functions.
• Organized into layers of neurons.
• Each neuron is connected to every
neuron in the previous layer.
• Each layer transforms the output of
the previous layer and then passes it
on to the next.
• Every connection has a separate weight.
Deep Learning
Deep Learning refers to when many layers are used to build deep networks.
• State-of-the-art models use hundreds of layers.
• Deep layers tend to decrease in width.
• Successive layers transform inputs with two effects:
• Compression: each layer is asked to summarize the
input in a way that best serves the task.
• Extraction: the model succeeds when each layer
extracts task-relevant information.
Input
Output
21
Steps in Building a Fully Connected Network
To build a fully connected network a user needs to:
• Define the network architecture.
• How many layers and neurons?
• Define what activation function to
use for each neuron.
• Define an evaluation metric.
• The values for the weights are learned
during model training.
http://svail.github.io/mandarin/
22
Evaluation Metric
The metric used will depend on the problem being solved.
Some examples include:
• Regression
• Mean Squared Error
• Classification
• Categorical Cross-Entropy
• Multi-Label classification
• Binary Cross-Entropy
23
Fully Connected Network Problems
Not optimal for detecting features.
• Computationally intensive – heavy memory usage.
25
Convolutional Neural Network
Convolutional neural networks reduce the required
computation and are good for detecting features.
• Each neuron is connected to a small set of nearby
neurons in the previous layer.
• The same set of weights are used for each neuron.
• Ideal for spatial feature recognition.
• Example: image recognition
• Cheaper on resources due to fewer connections.
http://svail.github.io/mandarin/
26
Convolutions as Feature Detectors
-1 1 -1
-1 1 -1
-1 1 -1
Vertical Line Detector
-1 -1 -1
1 1 1
-1 -1 -1
Horizontal Line Detector
-1 -1 -1
-1 1 1
-1 1 1
Corner Detector
Convolutions can be thought of as “local feature detectors”.
27
Convolutions
28
Convolutional Neural Network
Convolutions
Fully Connected
30
Transfer Learning
There are difficulties with building convolutional neural networks.
• They require huge datasets.
• There is a large amount of required computation.
• A large amount of time is spent experimenting to
get the hyper-parameters correct.
• It would be very difficult to train a famous,
competition-winning model from scratch
31
Transfer Learning
We can take advantage of existing DL models.
• Early layers in a neural network are the hardest (slowest) to train.
• These ”primitive” features should be general across many image
classification tasks.
• Later layers in the network capture features that are more particular to
the specific image classification problem.
• Later layers are easier (quicker) to train since adjusting their weights has
a more immediate impact on the final result.
32
Transfer Learning
Keeping the early layers of a pre-trained network, and re-training the
later layers for a specific application, is called transfer learning.
• Results of the training are weights (numbers) that are easy to store.
• Using pre-trained layers reduces the amount of required data.
33
Convolutional Neural Network
Convolutions
Fully Connected
Fix Weights
Train only the
last layer
34
Transfer Learning Options
The additional training of a pre-trained network on a specific new
dataset is referred to as “fine-tuning”.
• Choose “how much” and “how far back” to fine-tune.
• Should I train just the very last layer, or go back a few layers?
• Re-train the entire network (from the starting-point of the existing
network)?
36
AlexNet
37
VGG 16
38
Inception
39
MobileNets
Efficient models for mobile and embedded vision applications.
MobileNet models can be applied to various recognition tasks for efficient device intelligence.
40
Learning Objectives
• Identify the types of problems Deep Learning resolves.
• Describe the steps in building a neural network model.
• Describe a convolutional neural network.
• Explain transfer learning and why it’s useful.
• Identify common Deep Learning architectures.
In this session we worked to:
Deep learning summary

Deep learning summary

  • 1.
  • 2.
    Legal Notices andDisclaimers This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at intel.com. This sample source code is released under the Intel Sample Source Code License Agreement. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. Copyright © 2018, Intel Corporation. All rights reserved.
  • 3.
    3 Learning Objectives • Identifythe types of problems Deep Learning resolves • Describe the steps in building a neural network model • Describe a convolutional neural network • Explain transfer learning and why it’s useful • Identify common Deep Learning architectures You will be able to:
  • 5.
    5 Deep Learning “Machine learningthat involves using very complicated models called “deep neural networks”." (Intel) Models determine best representation of original data; in classic machine learning, humans must do this.
  • 6.
    6 Classic Machine Learning Step 1: Determine features. Step2: Feed them through model. “Arjun" Neural Network “Arjun" Deep Learning Steps 1 and 2 are combined into 1 step. Deep Learning Differences Machine Learning Classifier Algorithm Feature Detection
  • 7.
    7 Deep Learning ProblemTypes Deep Learning can solve multiple supervised and unsupervised problems. • The majority of its success has been when working with images, natural language, and audio data. • Image classification and detection. • Semantic segmentation. • Natural language object retrieval. • Speech recognition and language translation.
  • 8.
    8 Classification and Detection Detectand label the image • Person • Motor Bike https://people.eecs.berkeley.edu/~jhoffman/talks/lsda-baylearn2014.pdf
  • 9.
    9 Semantic Segmentation Label everypixel https://people.eecs.berkeley.edu/~jhoffman/talks/lsda-baylearn2014.pdf
  • 10.
    10 Natural Language ObjectRetrieval http://arxiv.org/pdf/1511.04164v3.pdf
  • 11.
    11 Speech Recognition andLanguage Translation http://svail.github.io/mandarin/ The same architecture can be used for speech recognition in English, or in Mandarin Chinese.
  • 13.
    Formulating Supervised LearningTools 13 For a supervised learning problem: • Collect a labeled dataset (features and target labels). • Choose the model. • Choose an evaluation metric: “What to use to measure performance.” • Choose an optimization method:1 “How to find the model configuration that gives the best performance.” 1 There are standard methods to use for different models and metrics.
  • 14.
    Which Model? 14 There aremany models that represent the problem and make decisions in different ways, each with their own advantages and disadvantages. • DL models are biologically inspired. • The main building block is a neuron. Neuron
  • 15.
    Neuron 15 A neuron multiplieseach feature by a weight and then adds these values together. • Z = X1W1+ X2W2+ X3W3 Z X1 X2 X3 W2 W1 W3
  • 16.
    Neuron 16 This value isthen put through a function called the activation function. • There are several activation functions that can be used. • The output of the neuron is the output of the activation function. Activation Function X1 X2 X3 W2 W1 W3 Output
  • 17.
    Perceptron 17 A neuron witha simple activation function can solve linearly separable problems. • These are problems where the different classes can be separated by a line. • The Perceptron: one of the earliest neural network models that used neurons with simple activation functions.
  • 18.
    Perceptron 18 Problems where thelabels cannot be separated by a single line are not solvable by a single neuron. • This is a major limitation, and one of the reasons for the first AI winter, that was discussed in lesson 1.
  • 19.
    19 Fully Connected Network http://svail.github.io/mandarin/ Morecomplicated problems can be solved by connecting multiple neurons together and using more complicated activation functions. • Organized into layers of neurons. • Each neuron is connected to every neuron in the previous layer. • Each layer transforms the output of the previous layer and then passes it on to the next. • Every connection has a separate weight.
  • 20.
    Deep Learning Deep Learningrefers to when many layers are used to build deep networks. • State-of-the-art models use hundreds of layers. • Deep layers tend to decrease in width. • Successive layers transform inputs with two effects: • Compression: each layer is asked to summarize the input in a way that best serves the task. • Extraction: the model succeeds when each layer extracts task-relevant information. Input Output
  • 21.
    21 Steps in Buildinga Fully Connected Network To build a fully connected network a user needs to: • Define the network architecture. • How many layers and neurons? • Define what activation function to use for each neuron. • Define an evaluation metric. • The values for the weights are learned during model training. http://svail.github.io/mandarin/
  • 22.
    22 Evaluation Metric The metricused will depend on the problem being solved. Some examples include: • Regression • Mean Squared Error • Classification • Categorical Cross-Entropy • Multi-Label classification • Binary Cross-Entropy
  • 23.
    23 Fully Connected NetworkProblems Not optimal for detecting features. • Computationally intensive – heavy memory usage.
  • 25.
    25 Convolutional Neural Network Convolutionalneural networks reduce the required computation and are good for detecting features. • Each neuron is connected to a small set of nearby neurons in the previous layer. • The same set of weights are used for each neuron. • Ideal for spatial feature recognition. • Example: image recognition • Cheaper on resources due to fewer connections. http://svail.github.io/mandarin/
  • 26.
    26 Convolutions as FeatureDetectors -1 1 -1 -1 1 -1 -1 1 -1 Vertical Line Detector -1 -1 -1 1 1 1 -1 -1 -1 Horizontal Line Detector -1 -1 -1 -1 1 1 -1 1 1 Corner Detector Convolutions can be thought of as “local feature detectors”.
  • 27.
  • 28.
  • 30.
    30 Transfer Learning There aredifficulties with building convolutional neural networks. • They require huge datasets. • There is a large amount of required computation. • A large amount of time is spent experimenting to get the hyper-parameters correct. • It would be very difficult to train a famous, competition-winning model from scratch
  • 31.
    31 Transfer Learning We cantake advantage of existing DL models. • Early layers in a neural network are the hardest (slowest) to train. • These ”primitive” features should be general across many image classification tasks. • Later layers in the network capture features that are more particular to the specific image classification problem. • Later layers are easier (quicker) to train since adjusting their weights has a more immediate impact on the final result.
  • 32.
    32 Transfer Learning Keeping theearly layers of a pre-trained network, and re-training the later layers for a specific application, is called transfer learning. • Results of the training are weights (numbers) that are easy to store. • Using pre-trained layers reduces the amount of required data.
  • 33.
    33 Convolutional Neural Network Convolutions FullyConnected Fix Weights Train only the last layer
  • 34.
    34 Transfer Learning Options Theadditional training of a pre-trained network on a specific new dataset is referred to as “fine-tuning”. • Choose “how much” and “how far back” to fine-tune. • Should I train just the very last layer, or go back a few layers? • Re-train the entire network (from the starting-point of the existing network)?
  • 36.
  • 37.
  • 38.
  • 39.
    39 MobileNets Efficient models formobile and embedded vision applications. MobileNet models can be applied to various recognition tasks for efficient device intelligence.
  • 40.
    40 Learning Objectives • Identifythe types of problems Deep Learning resolves. • Describe the steps in building a neural network model. • Describe a convolutional neural network. • Explain transfer learning and why it’s useful. • Identify common Deep Learning architectures. In this session we worked to:

Editor's Notes

  • #7 For some tasks, like image recognition, Deep Learning can eliminate feature extraction.
  • #11 localize a target object within a given image based on a natural language query of the object. Natural language object retrieval differs from text-based image retrieval task as it involves spatial information about objects within the scene and global scene context
  • #12 CTC – Connectionist Temporal Classification English – Non tonal language, pitch is not important. So the traditional speech detection systems which convert raw audio to frequency lose out on pitch and work for english. Mandarin – Tonal, so pitch is important. Deep speech does not convert to frequency, so using raw input sound, train the NN for both english and mandarin Traditional speech detection systems extract abstract units of sound (Phonemes), map every word in the vocabulary to a set of phonemes. Adapting this to madarin would require constructing a new lexicon. Deep speech – does not need a new lexicon since the R-CNNs maps variable length speech to characters can adapt equally well to both english and mandarin. Language modelling – how probable is a string of words in a given language. English – words are separated by space, mandarin not. Deep speech uses a character based segmentation. All in all, most significant change is extending the last layer to include more characters in the mandarin language and deep speech works equally well.
  • #40 We’ll explore this more in this lesson’s assignment.