Tutorial: Using Convolutional Neural Networks to Detect Object Keypoints

In this tutorial, you will learn how to train a convolutional neural network model to detect object keypoints in images. We chose license plates as the example object, but the tutorial can serve as a guide for detecting other objects.

This tutorial supplements the publication "Convolutional Neural Networks for Object Detection" on our website: http://rnd.azoft.com/convolutional-neural-networks-for-object-detection/

Published in: Science
  • Great work. Could you please post documentation of the experimental results and the resources they rely on to support your work?

  1. rnd.azoft.com Tutorial: Using Convolutional Neural Networks to Detect Object Keypoints
  2. About the project
     In this tutorial, you will learn how to train a convolutional neural network model to detect object keypoints in images. We chose license plates as the example object, but the tutorial can serve as a guide for detecting other objects.
     Overview
     1. Choosing images for neural network training
     2. Labeling the keypoints
     3. Data augmentation
     4. Packing a dataset in HDF5
     5. Training a convolutional neural network
     6. Conclusion
     7. Related links
  3. 1. Choosing images for neural network training
     The training dataset should contain from a few hundred to a few thousand original (not augmented) images. The more, the better.
  4. 2. Labeling the keypoints
     If the keypoints in the original images are not labeled, you need to label them. This means saving the keypoint coordinates in a .txt or .csv file. Each keypoint has two coordinate values: one for the horizontal axis and one for the vertical axis.
     Remark: If you label several keypoints, always label them in the same order. For example, when labeling a license plate, you might mark the upper-left corner of the plate with the first dot, the upper-right corner with the second dot, the lower-left corner with the third dot, and the lower-right corner with the fourth dot. Keep this order for every image.
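The labeling convention above could be read back with a small helper like the one below. This is only a sketch under assumptions that the slides do not state: one CSV row per image in the form `name,x1,y1,...,x4,y4`, and the hypothetical names `KEYPOINT_ORDER` and `read_keypoints`.

```python
import csv
import io

# Assumed fixed labeling order; every image must follow the same sequence.
KEYPOINT_ORDER = ("upper_left", "upper_right", "lower_left", "lower_right")


def read_keypoints(csv_file):
    """Return {image_name: [(x, y), ...]} with points in KEYPOINT_ORDER."""
    expected = 2 * len(KEYPOINT_ORDER)
    labels = {}
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        name = row[0]
        values = [float(v) for v in row[1:]]
        if len(values) != expected:
            raise ValueError(f"{name}: expected {expected} values, got {len(values)}")
        # Pair up x values (even positions) with y values (odd positions).
        labels[name] = list(zip(values[0::2], values[1::2]))
    return labels


# Hypothetical sample row: file name followed by four (x, y) pairs.
sample = io.StringIO("plate_001.jpg,12,30,110,28,14,58,112,56\n")
labels = read_keypoints(sample)
```

Rejecting rows with the wrong number of values catches images where a keypoint was skipped, which would otherwise silently shift the order of every later point.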
  5. 3. Data augmentation
     Effective training requires a dataset of several thousand to tens of thousands of images. If the initial dataset is too small, augment the images.
     Remark: Before you start augmenting, split your dataset into training and control (validation) parts. This guarantees that images produced by augmenting one picture never end up in both the training part and the control part. If you skip this step, you will hardly be able to detect overfitting of the model.
  6. 3. Data augmentation
     Here are the transformations that can be used for augmentation:
     ● Rotations relative to the center
     ● Perspective distortion
     ● Resize
     ● Shifts
     ● Salt-and-pepper noise
     ● Blurring and sharpening
     ● Erosion and dilation
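Geometric transformations change the keypoint labels as well as the pixels, so the coordinates have to be transformed together with the image. The sketch below covers only the label side for two transformations from the list (rotation about the center and shift), plus the split of original images that has to happen before augmentation. All function names and the 20% control fraction are assumptions for illustration; the image itself must be transformed with the same angle, direction, and offsets by whatever image library you use.

```python
import math
import random


def split_originals(names, control_fraction=0.2, seed=0):
    """Split ORIGINAL image names into (training, control) before augmenting."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    n_control = max(1, int(len(shuffled) * control_fraction))
    return shuffled[n_control:], shuffled[:n_control]


def rotate_keypoints(points, angle_deg, width, height):
    """Rotate (x, y) points about the image center by angle_deg.

    Uses the standard rotation matrix in pixel coordinates (y axis pointing
    down), so make sure the image is rotated in the same direction.
    """
    cx, cy = width / 2.0, height / 2.0
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in points
    ]


def shift_keypoints(points, dx, dy):
    """Translate (x, y) points by (dx, dy) pixels."""
    return [(x + dx, y + dy) for x, y in points]


# Split first, then augment each part separately.
train, control = split_originals([f"img_{i}.jpg" for i in range(10)])
rotated = rotate_keypoints([(100.0, 50.0)], 90, 100, 100)
shifted = shift_keypoints([(10.0, 20.0)], 5, -5)
```

Because the split works on original file names, every augmented copy inherits the part of its source image, and the control part stays independent of the training part.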
  7. 4. Packing a dataset in HDF5
     To use the Caffe framework, you need to pack the dataset into the HDF5 file format. Normalize pixel values to the range from 0 to 1 and coordinate values to the range from -1 to 1.
     Remark:
     ● If you augmented the initial images, the images in the HDF5 file must be stored in random order.
     ● After packing the dataset in HDF5, check the resulting file with an HDF5 viewer utility (for example, HDFView). Pixel values have to lie between 0 and 1, coordinates between -1 and 1, and the images must not be distorted.
  8. 4. Packing a dataset in HDF5 (slide has no transcript text)
  9. 4. Packing a dataset in HDF5 (slide has no transcript text)
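The normalization and packing steps might look roughly like the sketch below. The slides do not give the formulas, so the mappings used here (pixel / 255 and 2 * x / width - 1) are assumptions, as are the function names and the single-channel 8-bit image layout. The dataset names "data" and "label" follow the `top` names of the HDF5Data layer. Writing the file needs the third-party packages numpy and h5py, which are imported only inside `write_hdf5`.

```python
import random


def normalize_pixels(image_rows):
    """Scale 8-bit pixel values (0..255) to the range [0, 1]."""
    return [[p / 255.0 for p in row] for row in image_rows]


def normalize_keypoints(points, width, height):
    """Map pixel coordinates to [-1, 1], flattened as [x1, y1, x2, y2, ...]."""
    flat = []
    for x, y in points:
        flat.extend((2.0 * x / width - 1.0, 2.0 * y / height - 1.0))
    return flat


def write_hdf5(path, images, labels, seed=0):
    """Shuffle samples and write them as 'data' and 'label' datasets.

    images: list of grayscale images, each a list of pixel rows (0..255)
    labels: list of already normalized, flattened keypoint lists
    """
    import numpy as np  # third-party, imported lazily
    import h5py         # third-party, imported lazily

    order = list(range(len(images)))
    random.Random(seed).shuffle(order)  # random order, as the remark requires
    # The extra brackets add the channel axis: shape is N x 1 x H x W.
    data = np.asarray([[normalize_pixels(images[i])] for i in order],
                      dtype=np.float32)
    label = np.asarray([labels[i] for i in order], dtype=np.float32)
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=data)
        f.create_dataset("label", data=label)


# Corners and center of a hypothetical 128 x 64 image.
corners = normalize_keypoints([(0, 0), (64, 32), (128, 64)], 128, 64)
pixels = normalize_pixels([[0, 255]])
```

The same index order is applied to images and labels, so each image stays paired with its own keypoints after shuffling.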
  10. 5. Training the convolutional neural network
      We recommend starting training with the Adam optimization method. The input layer should look like this:

          layer {
            name: "data"
            type: "HDF5Data"
            top: "data"
            top: "label"
            hdf5_data_param {
              # text file that lists the paths of the HDF5 files
              source: "/home/user/caffe/examples/regression/regression_train.txt"
              batch_size: 256
            }
          }
  11. 5. Training the convolutional neural network
      The number of outputs of the output layer has to equal the number of coordinate values (8 for four keypoints). We recommend EuclideanLoss as the loss layer.

          layer {
            name: "ipout"
            type: "InnerProduct"
            bottom: "ip01"
            top: "ipout"
            inner_product_param {
              num_output: 8
              weight_filler { type: "msra" }
              bias_filler { type: "constant" }
            }
          }
          layer {
            name: "loss"
            type: "EuclideanLoss"
            bottom: "ipout"
            bottom: "label"
            top: "loss"
          }
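To make the loss value in the training log easier to interpret, here is a plain-Python sketch of what the EuclideanLoss layer computes (the sum of squared differences divided by twice the batch size), together with a helper that maps network outputs back to pixel coordinates. The helper assumes the labels were normalized as 2 * x / width - 1, which the slides do not specify, and both function names are hypothetical.

```python
def euclidean_loss(predictions, labels):
    """Sum of squared differences over the batch, divided by 2 * batch size."""
    n = len(predictions)
    total = 0.0
    for pred, lab in zip(predictions, labels):
        total += sum((p - l) ** 2 for p, l in zip(pred, lab))
    return total / (2.0 * n)


def denormalize_keypoints(flat, width, height):
    """Map [x1, y1, x2, y2, ...] from [-1, 1] back to pixel (x, y) pairs."""
    xs, ys = flat[0::2], flat[1::2]
    return [((x + 1.0) * width / 2.0, (y + 1.0) * height / 2.0)
            for x, y in zip(xs, ys)]


# Batch of two samples with two output values each.
loss = euclidean_loss([[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]])
# Eight network outputs for a hypothetical 128 x 64 input image.
pixels = denormalize_keypoints([-1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 0.0, -1.0], 128, 64)
```

Because the labels are normalized, the loss is measured in normalized units; converting a few predictions back to pixels is a quick way to judge whether a given loss value is good enough.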
  12. Conclusion
      Training an accurate and fast neural network in just a few attempts is difficult. We made about 20 trials before we got an acceptable result. If you have questions about the idea, the implementation of the experiment, or the code, we’ll be glad to answer them in the comments below.
      Related links
      We based the experiment on these works:
      1. Using convolutional neural nets to detect facial keypoints
      2. Caffe-regression examples, Kaggle face keypoint detection
      Read the detailed Convolutional Neural Networks for Object Detection project overview
