WHAT IS A CNN?
• A class of feedforward artificial neural networks applied to analyzing
visual imagery.
• A CNN is designed to automatically and adaptively learn spatial hierarchies of
features through backpropagation, using multiple building blocks such as
convolutional layers, pooling layers, and fully connected layers.
• CNNs are simply neural networks that use convolution in place of general matrix
multiplication in at least one of their layers.
• BY USING CONVOLUTION:
WE CAN REPRESENT THE INPUT SIGNAL WITH A NEW REPRESENTATION
IN WHICH THE MOST IMPORTANT CHARACTERISTICS OF THE GIVEN SIGNAL
BECOME VERY OBVIOUS.
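The "new representation" idea can be seen in one dimension: convolving a step signal with a difference kernel makes the step's location stand out. A minimal sketch in plain Python (the signal and kernel here are illustrative, not from the slides):

```python
# Convolving a step signal with an edge-detection kernel makes the edge
# (the "important characteristic") obvious in the new representation.

def conv1d_valid(signal, kernel):
    """'Valid' 1-D convolution: flip the kernel, slide it, no padding."""
    k = list(reversed(kernel))
    n = len(signal) - len(k) + 1
    return [sum(signal[i + j] * k[j] for j in range(len(k))) for i in range(n)]

signal = [0, 0, 0, 1, 1, 1]   # a step signal
kernel = [1, -1]              # difference (edge-detection) filter
features = conv1d_valid(signal, kernel)
print(features)               # -> [0, 0, 1, 0, 0]: only the edge is nonzero
```

A 2-D convolutional layer does the same thing with small learned kernels slid over the image.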
“MOTIVATION”
DETECTION AND CLASSIFICATION TASKS:
• CV
• NLP (TEXT)
• AUDIO
CNN ARCHITECTURE PARTS:
• CONVOLUTIONAL BASE:
Composed of: a stack of convolutional and pooling layers.
Main goal: to generate features from the image.
• CLASSIFIER:
Composed of: usually fully connected layers.
Main goal: to classify the image based on the detected features.
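The two parts can be sketched with tf.keras (assumes TensorFlow is installed; the layer sizes and input shape are illustrative, not from the slides):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    # Convolutional base: stacked conv + pooling layers generate features
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classifier: fully connected layers classify from the detected features
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),   # 10 hypothetical classes
])
model.summary()
```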
WHY CNN?
• SPARSE CONNECTIVITY
• PARAMETER SHARING
• TRANSFER LEARNING
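Sparse connectivity and parameter sharing can be made concrete with a quick parameter count (a minimal sketch; the image size and layer widths are illustrative, not from the slides):

```python
# Compare parameter counts for one layer on a 224x224x3 image.
h, w, c = 224, 224, 3
n_in = h * w * c                       # 150528 input values

# Fully connected layer, 100 units: every input connects to every unit.
dense_params = n_in * 100 + 100        # weights + biases -> 15052900

# Conv layer, 100 filters of size 3x3: each unit sees only a 3x3 patch
# (sparse connectivity) and the same kernel is reused everywhere
# (parameter sharing).
conv_params = (3 * 3 * c) * 100 + 100  # weights + biases -> 2800

print(dense_params, conv_params)
```

Five thousand times fewer parameters for the convolutional layer, which is why CNNs can be trained on realistic amounts of image data.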
* DEEP NETWORKS HAVE A LARGE NUMBER OF UNKNOWN
PARAMETERS (IN THE MILLIONS).
THE TASK OF TRAINING A NETWORK IS TO FIND THE
OPTIMUM PARAMETERS USING THE TRAINING DATA.
* IF WE HAVE VERY LITTLE DATA, WE WILL GET ONLY
APPROXIMATE VALUES FOR MOST OF THE PARAMETERS,
WHICH WE DON'T WANT.
For deep networks: more data -> better learning.
THE PROBLEM:
• IT IS DIFFICULT TO GET SUCH HUGE LABELED
DATASETS FOR TRAINING THE NETWORK.
• EVEN IF YOU GET THE DATA, IT TAKES A LARGE
AMOUNT OF TIME TO TRAIN THE NETWORK
(HUNDREDS OF HOURS). THUS, IT TAKES A LOT OF
TIME, MONEY, AND EFFORT TO TRAIN A DEEP
NETWORK SUCCESSFULLY.
THE SOLUTION: TRANSFER LEARNING
Transfer of learning is the ability to apply knowledge learned in one
context to new contexts.
Variations:
● Same domain, different task
● Different domain, same task
• Instead of starting the learning process from scratch (with random
initialization), start from a network that has already been trained on a
different problem.
• HOW TO REPURPOSE A PRE-TRAINED MODEL FOR YOUR OWN NEEDS:
1) Start by removing the original classifier ("off-the-shelf").
2) Then add a new classifier that fits your purposes.
3) Finally, train the new model on the off-the-shelf features according to
one of three strategies (fine-tuning).
TRANSFER LEARNING: IDEA
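The three repurposing steps can be sketched with tf.keras and a bundled VGG16 base (assumes TensorFlow is installed; `num_classes` and the dense-layer size are placeholders for your own task):

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

num_classes = 5  # hypothetical number of target classes

# 1) Remove the original classifier: include_top=False keeps only the
#    convolutional base with its ImageNet weights.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the off-the-shelf features

# 2) Add a new classifier that fits your purposes.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])

# 3) Train the new model on the off-the-shelf features.
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_data=..., epochs=10)
```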
FINE-TUNING: SUPERVISED TASK ADAPTATION
FINE-TUNE THE NETWORK USING BACKPROP WITH LABELS FOR THE
TARGET DOMAIN UNTIL THE VALIDATION LOSS STARTS TO INCREASE.
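"Until the validation loss starts to increase" is exactly what an early-stopping callback automates; a sketch with tf.keras (assumes a compiled `model` and labeled target-domain data; the patience value is an illustrative choice):

```python
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch validation loss on the target domain
    patience=2,                 # tolerate 2 worsening epochs before stopping
    restore_best_weights=True,  # roll back to the best epoch's weights
)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=50, callbacks=[early_stop])
```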
TRANSFER LEARNING SCENARIOS:
SIMILARITY MATRIX: CLASSIFY YOUR PROBLEM
ACCORDING TO THE SIZE OF YOUR DATA
This matrix classifies your computer vision problem by the size of your
dataset and its similarity to the dataset on which your pre-trained model
was trained.
KERAS PRE-TRAINED IMAGENET MODELS
WHAT IS IMAGENET?
ImageNet is a project that aims to provide a large image database for research
purposes. It contains more than 14 million images belonging to more than
20,000 classes (or synsets). It also provides bounding box annotations for
around 1 million images, which can be used in object localization tasks.
KERAS PRETRAINED MODELS
• Keras comes bundled with many models.
• A trained model has two parts: the model architecture and the model weights.
The weights are large files, so they are not bundled with Keras; they are
downloaded on first use.
MODELS:
• VGG16
• INCEPTIONV3
• RESNET50
• MOBILENET
• INCEPTIONRESNETV2
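Loading one of the bundled architectures together with its ImageNet weights takes a few lines (the weight file is fetched on first use; the random array below is a stand-in for a real image):

```python
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import (
    preprocess_input, decode_predictions)

model = MobileNet(weights="imagenet")  # architecture + downloaded weights

# Stand-in for a real 224x224 RGB image batch.
img = (np.random.rand(1, 224, 224, 3) * 255).astype("float32")
preds = model.predict(preprocess_input(img))        # 1000 ImageNet scores
print(decode_predictions(preds, top=3)[0])          # top-3 (id, name, score)
```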
FUTURE WORK
• IN DEEP LEARNING:
While not perfect, this approach does work. There are, though, many
improvements to be made:
* COLLECT MORE DATA: convnets require lots of data for good
performance.
* SUITABLE LOSS FUNCTION: for simplicity we used categorical cross-entropy
loss, which works out of the box.
We will switch to a more suitable loss function, one that better leverages
intra-class variance. A good option to start with might be a triplet loss
(see the FaceNet paper for details).
* BETTER OVERALL CLASSIFIER ARCHITECTURE:
the current classifier is basically a prototype; the goal is to switch to an
R-CNN architecture.
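The triplet loss mentioned above can be sketched in plain NumPy: it pulls an anchor toward a positive (same-class) embedding and pushes it away from a negative one by at least a margin (0.2 below, the value used in the FaceNet paper; the embeddings are made-up examples):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(||a - p||^2 - ||a - n||^2 + margin, 0), averaged over the batch."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

a = np.array([[0.0, 0.0]])
p = np.array([[0.1, 0.0]])    # same class, close to the anchor
n = np.array([[1.0, 0.0]])    # different class, far from the anchor
print(triplet_loss(a, p, n))  # -> 0.0: the negative is already far enough
```

Unlike cross-entropy over fixed classes, this directly shapes the embedding space, which is what "leveraging intra-class variance" asks for.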
