SlideShare a Scribd company logo
Transfer learning,
Active learning
using tensorflow object detection
api
Tensorflow Object detection api
Google released the tensorflow-based object detection API in June 2017.
By offering models with various speeds and accuracy, the general public can easily detect objects.
This announcement describes only object detection.
The more you go back, the harder the detection will be.
The object detection API supports both box and mask types.
Object detection
Detection of various objects in a single box is generally called object detection.
Although not mentioned in this announcement, it is called object sementatin that detects an object according to a shape rather than a box shape.
There are various types of image detection.
image classification is simply a determination of what the image is.
You can also localize from the image classification to the exact location of the object
Tensorflow Object detection api
Make
tfrecord
Re train Export Test
Evaluate
Loop
Optional
The most basic flow of the tensorflow object detection api.
All functions are provided to process the data to api, train this data, export the model to a usable form, and test this model.
You can also evaluate ongoing or completed models.
Tensorflow Object detection api
Make
tfrecord
Re train Export Test
Evaluate
Make tfrecord
python generate_tfRecord.py 
--csv_input=data/train.csv 
--output_path=data/train.record
The data type for using object detection api is tfrecord.
The source for creating this tfrecord file is provided, below is how to use it.
Set the csv file containing the image information and the output directory.
Finally, we need four numeric values to tell where each object is located in the image.
xmin, ymin, xmax, ymax values. With this value, you can create a box in which the object is located.
There can be multiple objects in an image.
Make tfrecord
Image
Label
Xmin , Ymin
Xmax , Ymax
The data needed to create the tfrecord file is as follows:
First, you need the original image file.
Next, we need a label for each object in the image.
Make tfrecord
LabelImg
https://github.com/tzutalin/labelImg
It is difficult to manually determine the location point of an object.
There are various programs that do this automatically.
I used the most widely known labelimg.
Make tfrecord
When you use labelimg, an xml file is created that contains the label of each object and the values of xmin, ymin, xmax, and ymax.
Make tfrecord
If we have an image to train, xml, and a labelmap that stores the id for each class, we can generate a tfrecord file.
Make tfrecord tfgenerator ( custom )
The default code is train and test.
Since this is a troublesome task, I have created a program that will do this automatically.
When executed, each class is distributed according to the ratio of train, validate, and a log is generated.
Tensorflow Object detection api
Make
tfrecord
Re train Export Test
Evaluate
Re train Transfer learning
python object_detection/train.py 
--logtostderr 
--pipeline_config_path=pipeline.config 
--train_dir=train
object detection api python base code related to train.
Set up the model you want to train, and set the training output directory.
There are many other options.
ssd_mobilenet_v1_coco stands for ssd_mobilenet, which trained Coco Dataset.
Pre-trained datasets include COCO, Kitti, and Open Images datasets.
Re train Object detection API model zoo
ssd_mobilenet_v1_coco
Dataset : COCO dataset, Kitti dataset, Open Images dataset.
It provides various pre-trained models for object detection of tensorflow.
Each model has different speed and accuracy.
The first part of the model name is the algorithm, and the second part is the data set.
Re train What you want to detect
COCO dataset Instance
If the class you want to detect is an instance already in coco, kitty, or open image dataset, you do not need to train.
Just use the built-in model and select it.
Re train What you want to detect
?
However, if you want to classify dinosaurs such as Tyrannosaurus and Triceratops as shown in the picture, we need to train on the above classes.
Here we have two choices.
Create models with completely new layers, or leverage existing models.
Re train Make own model
It is known that the depth of the neural network and the wider the layer, the higher the accuracy.
The picture is google inception v2 model.
Deep networks require exponential computing resources.
Re train Make own model
google TPU 2.0
45 TFLOPS
Titan v
15 TFLOPS
gtx 1080
9 TFLOPS
gtx 970
4 TFLOPS
This time I released tpu 3.0 on Google i / o 2018.
To simplify the quantification of the flops power of tpu 2.0 and other graphics cards: Perhaps the maximum gpu that a non-enterprise user can have is titan.
Google uses a tpu pot that contains 64 tpu. Google may take weeks or even months to do the work for an hour.
Teaching a dinosaur to a child can be quite challenging, but teaching a dinosaur to an adult is a lot easier.
Transfer learning is similar to teaching dinosaurs to adults.
But if you show dinosaur photos to an adult, you will see them as animals by looking at their eyes, legs and tails.
Is not that a kind of reptile based on various information such as skin texture? You will think.
If you show your baby a photo of a dinosaur, the baby probably has no idea.
Babies will not know what their eyes, tails, and feet are, and they will not even have the concept of point, line.
Re train Transfer learning
?
So we will use a method called transfer learning.
There are two people who do not know dinosaurs at all.
One is a very young baby and the other is an adult.
Re train Transfer learning
This photo is a visualization of the weights we have for each layer. Looking closer at the picture, the nearest layer to the input has a low-dimensional feature
such as a line, As you go to the output, you can see that it has more detailed high-dimensional features.
Since low dimensional features have any object, transfer learning takes it as it is.
Re train Transfer learning
replace custom class
For example, inception v2 model, we take only the weights of previous layers and change only the last fully connected layer to custom labels.
Tensorflow Object detection api
Make
tfrecord
Re train Export Test
Evaluate
If you give the checkpoint file that you have trained as an input, you can use frozon_inference_graph.pd
The file is created.
We can do the prediction from anywhere with this frozon_inference_graph.pd.
Export
python object_detection/export_inference_graph.py 
--input_type=image_tensor 
--pipeline_config_path=pipeline.config 
--trained_checkpoint_prefix=train/model.ckpt-xxxxx 
--output_directory=output
train_checkpoint_prefix
output_directory
Once you train your model and reach your goal, we need to export this output to an executable file.
Below is the object detection default python export code.
Tensorflow Object detection api
Make
tfrecord
Re train Export Test
Evaluate
Evaluate
python eval.py 
--logtostderr 
--pipeline_config_path=training/ssd_mobilenet_v1_coco.config 
--checkpoint_dir=training/ 
--eval_dir=eval/
tensorboard --logdir=./
We would like to quantitatively identify how good this model is during training, or after the train is over. The api to use is evaluate.
Below is the python evaluate code provided by the object detection api.
We can visually check the evaluate result value through the tensorboard.
Evaluate Using tensorboard
This is the main screen of the evaluated tensorboard.
Let's take a quick look at each one.
Evaluate Using tensorboard
First, you can see the loss value for classification or localization.
The lower the loss value, the better.
Evaluate Using tensorboard
Second, you can check the accuracy of each class.
For different reasons, each character does not have the same classification accuracy.
Evaluate Using tensorboard
Third, you can measure the accuracy of the entire validation data.
Accuracy measurement method is 0.5IoU.
Evaluate Using tensorboard
IoU is an abbreviation of intersection of union, which is the ratio of the sum of the overlapping areas of the actual ground truth box and the area of the prediction
truth box.
As you can see in the photo, the accuracy is not 100% even though the predicted value wraps around the actual value.
Evaluate Using tensorboard
Finally, on the tensorboard, we can see how the estimate of the image changes every evaluation step.
Tensorflow object detection helper tool
Make
tfrecord
Re train Export Test
Evaluate
Automation
The set of data creation, train, evaluate, and export described above must be done manually by default.
I created a program that automatically feels inconvenienced.
Tensorflow object detection helper tool
Since you have set all the settings to the default settings that you can run by default, you only need to select the model and enter the training step.
Tensorflow object detection helper tool
https://github.com/5taku/tensorflow_object_detection_helper_tool
You can check the log of the result value and the execution time of each process.
Tensorflow Object detection api
Make
tfrecord
Re train Export Test
Evaluate
Test
It is the code that loads the file generated by export and loads it into memory.
These images are detected using the loaded graph file.
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
# Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
od_graph_def = tf.GraphDef()
with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
tf.import_graph_def(od_graph_def, name='')
https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
Test cloud resource ( google cloud compute engine )
16 vCPU
60gb Ram
1 x NVIDIA Tesla P100
ubuntu 16.0.4
python 2.7.12
tensorflow 1.8.0
cuda 9.0
cudnn 7.1
All the tests related to the announcement were tested on the Google cloud virtual machine.
The specifications are as follows.
Test dataset
The dataset used in the test was a simpson dataset from kaggle.
Test dataset
name total bounding_box
Homer Simpson 2246 612
Ned Flanders 1454 595
Moe Szyslak 1452 215
Lisa Simpson 1354 562
Bart Simpson 1342 554
Marge Simpson 1291 557
Krusty The Clown 1206 226
Principal Skinner 1194 506
Charles Montgomery Burns 1193 650
Milhouse Van Houten 1079 210
Chief Wiggum 986 209
Abraham Grampa Simpson 913 595
Sideshow Bob 877 203
Apu Nahasapeemapetilon 623 206
Kent Brockman 498 213
Comic Book Guy 469 208
Edna Krabappel 457 212
Nelson Muntz 358 219
Each character is fully labeled. (300 to 2000)
Some of them have box data. (200 to 600)
The number of data is not constant for each character.
Train , Evaluate , Export
model = faster rcnn resnet 50
training step = 140,000
training : validate rate = 8 : 2
I have trained 140,000 faster rcnn resnet 50 pre-train models with my tool.
Performs evaluation every 10,000 times.
Total training time is approximately 6 hours and 30 minutes.
Test tensor board result
The overall iou value was 0.67, which did not perform well.
IoU for each character is also inferior in performance.
Test handmade check
I did not touch the figures well, so I decided to check the image by inserting the actual image.
Twenty images per character were predialized. There are a few errors that I see in my eyes :(
The criteria for each T F N are as follows. (If multiple labels are detected on one object, the highest accuracy is T.)
True ( T ) False ( F ) None ( N )
Improved accuracy
1. Change Model
2. Increase data
a. Data augmentation
b. Labelling
c. Active learning
3. etc….
The result of training 6,000 data for 140,000 times is not satisfactory.
How can I improve my accuracy?
There are many ways to increase accuracy.
Improved accuracy change model
Model name
Speed
(ms)
COCO
mAP[^1]
Size
Training
time
Outputs
ssd_mobilenet_v1_coco 30 21 86M 9m 44s Boxes
ssd_mobilenet_v2_coco 31 22 201M 11m 12s Boxes
ssd_inception_v2_coco 42 24 295M 8m 43s Boxes
faster_rcnn_inception_v2_coco 58 28 167M 4m 43s Boxes
faster_rcnn_resnet50_coco 89 30 405M 4m 28s Boxes
faster_rcnn_resnet50_lowproposals_coco 64 405M 4m 30s Boxes
rfcn_resnet101_coco 92 30 685M 6m 19s Boxes
faster_rcnn_resnet101_coco 106 32 624M 6m 13s Boxes
faster_rcnn_resnet101_lowproposals_coco 82 624M 6m 13s Boxes
faster_rcnn_inception_resnet_v2_atrous_coco 620 37 712M 18m 6s Boxes
faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco 241 712M Boxes
faster_rcnn_nas 1833 43 1.2G 47m 49s Boxes
faster_rcnn_nas_lowproposals_coco 540 1.2G Boxes
In choosing a model, tensorflow already offers several models.
We only need to select the model considering the accuracy, prediction speed, and training speed.
In the tests I've done, resnet50, inception v2 has guaranteed the best performance.
Improved accuracy Data augmentation
original
size up
size down
brightness up
brightness down
Rotating, zooming in and out using a single image is a widely used method of increasing data.
However, if you change the size of the image, you have to change the box position as well, which will be quite cumbersome.
Improved accuracy labelling
Too Expensive COST!!!
Another way to increase data is to manually create one object box boundary.
Manual labeling results in many labeling costs.
Active learning
Introducing Active learning, which allows labeling of images at low cost.
The basic concept is to use the data obtained by predicting the candidate data with the model made by using a small amount of data, and to use the result as
the training data again. Of course, the predicted result is verified by the person or machine named oracle.
Active learning
Training
labeled
data
unlabeled
data pool
Prediction
Query
if detection label == image label
an oracle
Simpson dataset is an ideal data set for active learning because each unboxed image is labeled.
The resnet 50 algorithm is trained 20,000 times with 6,000 box data provided initially.
Then, 12,000 unboxed images are predicted, and if the prediction result matches the image label, it is moved to the train dataset.
Active learning
The number of images adopted in the training set is as follows.
At the first prediction, two times the image was adopted, and since then the number has been reduced, but it has been adopted as a steady train set.
ID NAME 1 step 2 step 3 step 4 step 5 step 6 step 7 step
1 homer_simpson 626 1830 1980 2018 2050 2061 2066
2 ned_flanders 607 1214 1335 1355 1381 1387 1388
3 moe_szyslak 215 692 945 1121 1170 1238 1246
4 lisa_simpson 578 1116 1217 1234 1271 1271 1271
5 bart_simpson 571 1133 1243 1256 1256 1265 1272
6 marge_simpson 568 1229 1244 1248 1248 1250 1250
7 krusty_the_clown 239 558 1002 1139 1145 1147 1147
8 principal_skinner 519 1054 1110 1114 1116 1119 1122
9 charles_montgomery_burns 663 1058 1085 1099 1104 1105 1107
10 milhouse_van_houten 223 776 897 925 952 954 965
11 chief_wiggum 206 561 788 837 858 863 867
12 abraham_grampa_simpson 608 844 875 878 878 878 878
13 sideshow_bob 202 217 420 651 733 760 772
14 apu_nahasapeemapetilon 223 537 570 581 588 588 590
15 kent_brockman 217 250 270 324 371 415 415
16 comic_book_guy 224 351 387 395 398 403 403
17 edna_krabappel 226 227 327 370 384 386 390
18 nelson_muntz 217 284 287 288 297 297 299
Total 6932 13931 15982 16833 17200 17387 17448
Increase rate 6999 2051 851 367 187 61
train , evaluate , export
The first and seventh data split ratios.
Because the number of images per character differs a lot, the number of training is uneven.
However, the number of training data in all classes increased by 30% ~ 300%.
train , evaluate , export
Because there is a process of predicting 11,000 images, it took 21 hours for a task that took 6 hours and 30 minutes to work with active learning.
Test tensor board result
When the localization loss value was learned from the existing 6900 chapters, it decreased from 0.115 to 0.06 after applying active learning.
Test tensor board result
Accuracy for each class has also gradually decreased.
In active learning, the initial graph value is suddenly suddenly the train data is suddenly increased, but it is not certain.
Test tensor board result
0.66
0.87
IOU accuracy for the entire class has been increased from 0.66 to 0.87.
Active learning
As data increases, the accuracy of each test case increases.
Accuracy increases rapidly in the early days and continues to increase in the future.
Extra page
Further analysis
Active learning
Data 17,448
0 to 140,000
0 to 140,000
active learning every 20,000
Data 6,932
0 to 140,000
The first picture is the result of learning 6,932 pieces of data 140,000 times.
The second picture is the result of learning 140,000 times while adding training data to active learning every 20,000 times.
The last picture is the result of learning 140,400 data from the beginning of 17,448 data from the last result of active learning.
Active learning
Data 17,448
0 to 140,000
0 to 140,000
active learning every 20,000
Data 6,932
0 to 140,000
Looking at the results, some characters seem to show little increase in accuracy.
This is because the distribution of the entire dataset per class is so small that even if active learning is used, the percentage of the character in the entire data
set is rather lower.
comic_book_guy : 3.23 %
edna_krabappel : 3.26 %
nelson_muntz : 3.13 %
comic_book_guy : 2.30 %
edna_krabappel : 2.23 %
nelson_muntz : 1.71 %
Active learning
For kent brockman, the accuracy is fairly high, even though the data rate is 2.37%.
Though the test data may be luckily well-detected, I think that the characteristic white hair represents different classes and salient features.
kent_brockman : 2.37 %
Thank you!
1. Tensorflow Object detection API
2. Transfer learning
3. Object detection API helper tool
4. Active learning ( with test result )
Today we looked at the entire Tensorflow object detection API.
I also had a brief introduction to the concepts of tranfer learning and active learning and the helper tool I created.
We have confirmed that the accuracy of actual data is increased by active learning with low labeling cost.
Thank you!
Q&A

More Related Content

What's hot

Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
Suraj Aavula
 
TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...
TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...
TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...
Edureka!
 
Moving Object Detection And Tracking Using CNN
Moving Object Detection And Tracking Using CNNMoving Object Detection And Tracking Using CNN
Moving Object Detection And Tracking Using CNN
NITISHKUMAR1401
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
Itachi SK
 
Cnn
CnnCnn
Image compression .
Image compression .Image compression .
Image compression .
Payal Vishwakarma
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
MojammilHusain
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural network
Sumeet Kakani
 
Human Action Recognition
Human Action RecognitionHuman Action Recognition
Human Action Recognition
NAVER Engineering
 
Real-time object detection coz YOLO!
Real-time object detection coz YOLO!Real-time object detection coz YOLO!
Real-time object detection coz YOLO!
J On The Beach
 
Image Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPT
Image Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPTImage Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPT
Image Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPT
Akshit Arora
 
Object detection
Object detectionObject detection
Object detection
Jksuryawanshi
 
Image Enhancement in Spatial Domain
Image Enhancement in Spatial DomainImage Enhancement in Spatial Domain
Image Enhancement in Spatial Domain
DEEPASHRI HK
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
Jörgen Sandig
 
Anomaly Detection with GANs
Anomaly Detection with GANsAnomaly Detection with GANs
Anomaly Detection with GANs
홍배 김
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
leopauly
 
Introduction to computer graphics
Introduction to computer graphics Introduction to computer graphics
Introduction to computer graphics
Priyodarshini Dhar
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
Fellowship at Vodafone FutureLab
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
Taegyun Jeon
 
Human Emotion Recognition
Human Emotion RecognitionHuman Emotion Recognition
Human Emotion Recognition
Chaitanya Maddala
 

What's hot (20)

Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
 
TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...
TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...
TensorFlow Object Detection | Realtime Object Detection with TensorFlow | Ten...
 
Moving Object Detection And Tracking Using CNN
Moving Object Detection And Tracking Using CNNMoving Object Detection And Tracking Using CNN
Moving Object Detection And Tracking Using CNN
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
 
Cnn
CnnCnn
Cnn
 
Image compression .
Image compression .Image compression .
Image compression .
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural network
 
Human Action Recognition
Human Action RecognitionHuman Action Recognition
Human Action Recognition
 
Real-time object detection coz YOLO!
Real-time object detection coz YOLO!Real-time object detection coz YOLO!
Real-time object detection coz YOLO!
 
Image Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPT
Image Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPTImage Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPT
Image Segmentation using Otsu's Method - Computer Graphics (UCS505) Project PPT
 
Object detection
Object detectionObject detection
Object detection
 
Image Enhancement in Spatial Domain
Image Enhancement in Spatial DomainImage Enhancement in Spatial Domain
Image Enhancement in Spatial Domain
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
 
Anomaly Detection with GANs
Anomaly Detection with GANsAnomaly Detection with GANs
Anomaly Detection with GANs
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Introduction to computer graphics
Introduction to computer graphics Introduction to computer graphics
Introduction to computer graphics
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
 
Human Emotion Recognition
Human Emotion RecognitionHuman Emotion Recognition
Human Emotion Recognition
 

Similar to Transfer learning, active learning using tensorflow object detection api

Ai use cases
Ai use casesAi use cases
Ai use cases
Sparsh Agarwal
 
Apple Machine Learning
Apple Machine LearningApple Machine Learning
Apple Machine Learning
Denise Nepraunig
 
maXbox starter75 object detection
maXbox starter75 object detectionmaXbox starter75 object detection
maXbox starter75 object detection
Max Kleiner
 
Python-oop
Python-oopPython-oop
Python-oop
RTS Tech
 
Object detection
Object detectionObject detection
Object detection
Somesh Vyas
 
How to implement artificial intelligence solutions
How to implement artificial intelligence solutionsHow to implement artificial intelligence solutions
How to implement artificial intelligence solutions
Carlos Toxtli
 
Object oriented approach in python programming
Object oriented approach in python programmingObject oriented approach in python programming
Object oriented approach in python programming
Srinivas Narasegouda
 
Neural network image recognition
Neural network image recognitionNeural network image recognition
Neural network image recognition
Oleksii Sekundant
 
The Ring programming language version 1.4.1 book - Part 29 of 31
The Ring programming language version 1.4.1 book - Part 29 of 31The Ring programming language version 1.4.1 book - Part 29 of 31
The Ring programming language version 1.4.1 book - Part 29 of 31
Mahmoud Samir Fayed
 
IRJET- Object Detection in an Image using Convolutional Neural Network
IRJET- Object Detection in an Image using Convolutional Neural NetworkIRJET- Object Detection in an Image using Convolutional Neural Network
IRJET- Object Detection in an Image using Convolutional Neural Network
IRJET Journal
 
Designing a neural network architecture for image recognition
Designing a neural network architecture for image recognitionDesigning a neural network architecture for image recognition
Designing a neural network architecture for image recognition
ShandukaniVhulondo
 
GDE Lab 1 – Traffic Light Pg. 1 Lab 1 Traffic L.docx
GDE Lab 1 – Traffic Light  Pg. 1     Lab 1 Traffic L.docxGDE Lab 1 – Traffic Light  Pg. 1     Lab 1 Traffic L.docx
GDE Lab 1 – Traffic Light Pg. 1 Lab 1 Traffic L.docx
budbarber38650
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
Lviv Startup Club
 
The Ring programming language version 1.8 book - Part 7 of 202
The Ring programming language version 1.8 book - Part 7 of 202The Ring programming language version 1.8 book - Part 7 of 202
The Ring programming language version 1.8 book - Part 7 of 202
Mahmoud Samir Fayed
 
C++ [ principles of object oriented programming ]
C++ [ principles of object oriented programming ]C++ [ principles of object oriented programming ]
C++ [ principles of object oriented programming ]
Rome468
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
Renjith M P
 
Tensor flow
Tensor flowTensor flow
Tensor flow
Nikhil Krishna Nair
 
Python programming computer science and engineering
Python programming computer science and engineeringPython programming computer science and engineering
Python programming computer science and engineering
IRAH34
 
07slide.ppt
07slide.ppt07slide.ppt
07slide.ppt
NuurAxmed2
 
These questions will be a bit advanced level 2
These questions will be a bit advanced level 2These questions will be a bit advanced level 2
These questions will be a bit advanced level 2
sadhana312471
 

Similar to Transfer learning, active learning using tensorflow object detection api (20)

Ai use cases
Ai use casesAi use cases
Ai use cases
 
Apple Machine Learning
Apple Machine LearningApple Machine Learning
Apple Machine Learning
 
maXbox starter75 object detection
maXbox starter75 object detectionmaXbox starter75 object detection
maXbox starter75 object detection
 
Python-oop
Python-oopPython-oop
Python-oop
 
Object detection
Object detectionObject detection
Object detection
 
How to implement artificial intelligence solutions
How to implement artificial intelligence solutionsHow to implement artificial intelligence solutions
How to implement artificial intelligence solutions
 
Object oriented approach in python programming
Object oriented approach in python programmingObject oriented approach in python programming
Object oriented approach in python programming
 
Neural network image recognition
Neural network image recognitionNeural network image recognition
Neural network image recognition
 
The Ring programming language version 1.4.1 book - Part 29 of 31
The Ring programming language version 1.4.1 book - Part 29 of 31The Ring programming language version 1.4.1 book - Part 29 of 31
The Ring programming language version 1.4.1 book - Part 29 of 31
 
IRJET- Object Detection in an Image using Convolutional Neural Network
IRJET- Object Detection in an Image using Convolutional Neural NetworkIRJET- Object Detection in an Image using Convolutional Neural Network
IRJET- Object Detection in an Image using Convolutional Neural Network
 
Designing a neural network architecture for image recognition
Designing a neural network architecture for image recognitionDesigning a neural network architecture for image recognition
Designing a neural network architecture for image recognition
 
GDE Lab 1 – Traffic Light Pg. 1 Lab 1 Traffic L.docx
GDE Lab 1 – Traffic Light  Pg. 1     Lab 1 Traffic L.docxGDE Lab 1 – Traffic Light  Pg. 1     Lab 1 Traffic L.docx
GDE Lab 1 – Traffic Light Pg. 1 Lab 1 Traffic L.docx
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
 
The Ring programming language version 1.8 book - Part 7 of 202
The Ring programming language version 1.8 book - Part 7 of 202The Ring programming language version 1.8 book - Part 7 of 202
The Ring programming language version 1.8 book - Part 7 of 202
 
C++ [ principles of object oriented programming ]
C++ [ principles of object oriented programming ]C++ [ principles of object oriented programming ]
C++ [ principles of object oriented programming ]
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
 
Tensor flow
Tensor flowTensor flow
Tensor flow
 
Python programming computer science and engineering
Python programming computer science and engineeringPython programming computer science and engineering
Python programming computer science and engineering
 
07slide.ppt
07slide.ppt07slide.ppt
07slide.ppt
 
These questions will be a bit advanced level 2
These questions will be a bit advanced level 2These questions will be a bit advanced level 2
These questions will be a bit advanced level 2
 

Recently uploaded

Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
flufftailshop
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Jeffrey Haguewood
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 

Recently uploaded (20)

Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 

Transfer learning, active learning using tensorflow object detection api

  • 1. Transfer learning, Active learning using tensorflow object detection api
  • 2. Tensorflow Object detection api Google released the tensorflow-based object detection API in June 2017. By offering models with various speeds and accuracy, the general public can easily detect objects.
  • 3. This announcement describes only object detection. The more you go back, the harder the detection will be. The object detection API supports both box and mask types. Object detection Detection of various objects in a single box is generally called object detection. Although not mentioned in this announcement, it is called object sementatin that detects an object according to a shape rather than a box shape. There are various types of image detection. image classification is simply a determination of what the image is. You can also localize from the image classification to the exact location of the object
  • 4. Tensorflow Object detection api Make tfrecord Re train Export Test Evaluate Loop Optional The most basic flow of the tensorflow object detection api. All functions are provided to process the data to api, train this data, export the model to a usable form, and test this model. You can also evaluate ongoing or completed models.
  • 5. Tensorflow Object detection api Make tfrecord Re train Export Test Evaluate
  • 6. Make tfrecord python generate_tfRecord.py --csv_input=data/train.csv --output_path=data/train.record The data type for using object detection api is tfrecord. The source for creating this tfrecord file is provided, below is how to use it. Set the csv file containing the image information and the output directory.
  • 7. Finally, we need four numeric values to tell where each object is located in the image. xmin, ymin, xmax, ymax values. With this value, you can create a box in which the object is located. There can be multiple objects in an image. Make tfrecord Image Label Xmin , Ymin Xmax , Ymax The data needed to create the tfrecord file is as follows: First, you need the original image file. Next, we need a label for each object in the image.
  • 8. Make tfrecord LabelImg https://github.com/tzutalin/labelImg It is difficult to manually determine the location point of an object. There are various programs that do this automatically. I used the most widely known labelimg.
  • 9. Make tfrecord When you use labelimg, an xml file is created that contains the label of each object and the values of xmin, ymin, xmax, and ymax.
  • 10. Make tfrecord If we have an image to train, xml, and a labelmap that stores the id for each class, we can generate a tfrecord file.
  • 11. Make tfrecord tfgenerator ( custom ) The default code is train and test. Since this is a troublesome task, I have created a program that will do this automatically. When executed, each class is distributed according to the ratio of train, validate, and a log is generated.
  • 12. Tensorflow Object detection api Make tfrecord Re train Export Test Evaluate
  • 13. Re train Transfer learning python object_detection/train.py --logtostderr --pipeline_config_path=pipeline.config --train_dir=train object detection api python base code related to train. Set up the model you want to train, and set the training output directory. There are many other options.
  • 14. ssd_mobilenet_v1_coco stands for ssd_mobilenet, which trained Coco Dataset. Pre-trained datasets include COCO, Kitti, and Open Images datasets. Re train Object detection API model zoo ssd_mobilenet_v1_coco Dataset : COCO dataset, Kitti dataset, Open Images dataset. It provides various pre-trained models for object detection of tensorflow. Each model has different speed and accuracy. The first part of the model name is the algorithm, and the second part is the data set.
  • 15. Re train What you want to detect COCO dataset Instance If the class you want to detect is an instance already in coco, kitty, or open image dataset, you do not need to train. Just use the built-in model and select it.
  • 16. Re train What you want to detect ? However, if you want to classify dinosaurs such as Tyrannosaurus and Triceratops as shown in the picture, we need to train on the above classes. Here we have two choices. Create models with completely new layers, or leverage existing models.
  • 17. Re train Make own model It is known that the depth of the neural network and the wider the layer, the higher the accuracy. The picture is google inception v2 model. Deep networks require exponential computing resources.
  • 18. Re train Make own model google TPU 2.0 45 TFLOPS Titan v 15 TFLOPS gtx 1080 9 TFLOPS gtx 970 4 TFLOPS This time I released tpu 3.0 on Google i / o 2018. To simplify the quantification of the flops power of tpu 2.0 and other graphics cards: Perhaps the maximum gpu that a non-enterprise user can have is titan. Google uses a tpu pot that contains 64 tpu. Google may take weeks or even months to do the work for an hour.
  • 19. Teaching a dinosaur to a child can be quite challenging, but teaching a dinosaur to an adult is a lot easier. Transfer learning is similar to teaching dinosaurs to adults. But if you show dinosaur photos to an adult, you will see them as animals by looking at their eyes, legs and tails. Is not that a kind of reptile based on various information such as skin texture? You will think. If you show your baby a photo of a dinosaur, the baby probably has no idea. Babies will not know what their eyes, tails, and feet are, and they will not even have the concept of point, line. Re train Transfer learning ? So we will use a method called transfer learning. There are two people who do not know dinosaurs at all. One is a very young baby and the other is an adult.
  • 20. Re train Transfer learning This photo is a visualization of the weights we have for each layer. Looking closer at the picture, the nearest layer to the input has a low-dimensional feature such as a line, As you go to the output, you can see that it has more detailed high-dimensional features. Since low dimensional features have any object, transfer learning takes it as it is.
  • 21. Re train Transfer learning replace custom class For example, inception v2 model, we take only the weights of previous layers and change only the last fully connected layer to custom labels.
  • 22. Tensorflow Object detection api Make tfrecord Re train Export Test Evaluate
  • 23. If you give the checkpoint file that you have trained as an input, you can use frozon_inference_graph.pd The file is created. We can do the prediction from anywhere with this frozon_inference_graph.pd. Export python object_detection/export_inference_graph.py --input_type=image_tensor --pipeline_config_path=pipeline.config --trained_checkpoint_prefix=train/model.ckpt-xxxxx --output_directory=output train_checkpoint_prefix output_directory Once you train your model and reach your goal, we need to export this output to an executable file. Below is the object detection default python export code.
  • 24. Tensorflow Object detection api Make tfrecord Re train Export Test Evaluate
  • 25. Evaluate python eval.py --logtostderr --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --checkpoint_dir=training/ --eval_dir=eval/ tensorboard --logdir=./ We would like to quantitatively identify how good this model is during training, or after the train is over. The api to use is evaluate. Below is the python evaluate code provided by the object detection api. We can visually check the evaluate result value through the tensorboard.
  • 26. Evaluate Using tensorboard This is the main screen of the evaluated tensorboard. Let's take a quick look at each one.
  • 27. Evaluate Using tensorboard First, you can see the loss value for classification or localization. The lower the loss value, the better.
  • 28. Evaluate Using tensorboard Second, you can check the accuracy of each class. For different reasons, each character does not have the same classification accuracy.
  • 29. Evaluate Using tensorboard Third, you can measure the accuracy of the entire validation data. Accuracy measurement method is 0.5IoU.
  • 30. Evaluate Using tensorboard IoU is an abbreviation of intersection of union, which is the ratio of the sum of the overlapping areas of the actual ground truth box and the area of the prediction truth box. As you can see in the photo, the accuracy is not 100% even though the predicted value wraps around the actual value.
  • 31. Evaluate Using tensorboard Finally, on the tensorboard, we can see how the estimate of the image changes every evaluation step.
  • 32. Tensorflow object detection helper tool Make tfrecord Re train Export Test Evaluate Automation The set of data creation, train, evaluate, and export described above must be done manually by default. I created a program that automatically feels inconvenienced.
  • 33. Tensorflow object detection helper tool Since you have set all the settings to the default settings that you can run by default, you only need to select the model and enter the training step.
  • 34. Tensorflow object detection helper tool https://github.com/5taku/tensorflow_object_detection_helper_tool You can check the log of the result value and the execution time of each process.
  • 35. Tensorflow Object detection api Make tfrecord Re train Export Test Evaluate
  • 36. Test It is the code that loads the file generated by export and loads it into memory. These images are detected using the loaded graph file. # Path to frozen detection graph. This is the actual model that is used for the object detection. PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb' # Load a (frozen) Tensorflow model into memory. detection_graph = tf.Graph() with detection_graph.as_default(): od_graph_def = tf.GraphDef() with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: serialized_graph = fid.read() od_graph_def.ParseFromString(serialized_graph) tf.import_graph_def(od_graph_def, name='') https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
  • 37. Test cloud resource ( google cloud compute engine ) 16 vCPU 60gb Ram 1 x NVIDIA Tesla P100 ubuntu 16.0.4 python 2.7.12 tensorflow 1.8.0 cuda 9.0 cudnn 7.1 All the tests related to the announcement were tested on the Google cloud virtual machine. The specifications are as follows.
  • 38. Test dataset The dataset used in the test was a simpson dataset from kaggle.
  • 39. Test dataset name total bounding_box Homer Simpson 2246 612 Ned Flanders 1454 595 Moe Szyslak 1452 215 Lisa Simpson 1354 562 Bart Simpson 1342 554 Marge Simpson 1291 557 Krusty The Clown 1206 226 Principal Skinner 1194 506 Charles Montgomery Burns 1193 650 Milhouse Van Houten 1079 210 Chief Wiggum 986 209 Abraham Grampa Simpson 913 595 Sideshow Bob 877 203 Apu Nahasapeemapetilon 623 206 Kent Brockman 498 213 Comic Book Guy 469 208 Edna Krabappel 457 212 Nelson Muntz 358 219 Each character is fully labeled. (300 to 2000) Some of them have box data. (200 to 600) The number of data is not constant for each character.
  • 40. Train , Evaluate , Export model = faster rcnn resnet 50 training step = 140,000 training : validate rate = 8 : 2 I have trained 140,000 faster rcnn resnet 50 pre-train models with my tool. Performs evaluation every 10,000 times. Total training time is approximately 6 hours and 30 minutes.
  • 41. Test tensor board result The overall iou value was 0.67, which did not perform well. IoU for each character is also inferior in performance.
  • 42. Test handmade check I did not touch the figures well, so I decided to check the image by inserting the actual image. Twenty images per character were predialized. There are a few errors that I see in my eyes :( The criteria for each T F N are as follows. (If multiple labels are detected on one object, the highest accuracy is T.) True ( T ) False ( F ) None ( N )
  • 43. Improved accuracy 1. Change Model 2. Increase data a. Data augmentation b. Labelling c. Active learning 3. etc…. The result of training 6,000 data for 140,000 times is not satisfactory. How can I improve my accuracy? There are many ways to increase accuracy.
  • 44. Improved accuracy change model Model name Speed (ms) COCO mAP[^1] Size Training time Outputs ssd_mobilenet_v1_coco 30 21 86M 9m 44s Boxes ssd_mobilenet_v2_coco 31 22 201M 11m 12s Boxes ssd_inception_v2_coco 42 24 295M 8m 43s Boxes faster_rcnn_inception_v2_coco 58 28 167M 4m 43s Boxes faster_rcnn_resnet50_coco 89 30 405M 4m 28s Boxes faster_rcnn_resnet50_lowproposals_coco 64 405M 4m 30s Boxes rfcn_resnet101_coco 92 30 685M 6m 19s Boxes faster_rcnn_resnet101_coco 106 32 624M 6m 13s Boxes faster_rcnn_resnet101_lowproposals_coco 82 624M 6m 13s Boxes faster_rcnn_inception_resnet_v2_atrous_coco 620 37 712M 18m 6s Boxes faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco 241 712M Boxes faster_rcnn_nas 1833 43 1.2G 47m 49s Boxes faster_rcnn_nas_lowproposals_coco 540 1.2G Boxes In choosing a model, tensorflow already offers several models. We only need to select the model considering the accuracy, prediction speed, and training speed. In the tests I've done, resnet50, inception v2 has guaranteed the best performance.
  • 45. Improved accuracy Data augmentation original size up size down brightness up brightness down Rotating, zooming in and out using a single image is a widely used method of increasing data. However, if you change the size of the image, you have to change the box position as well, which will be quite cumbersome.
  • 46. Improved accuracy labelling Too Expensive COST!!! Another way to increase data is to manually create one object box boundary. Manual labeling results in many labeling costs.
  • 47. Active learning Introducing Active learning, which allows labeling of images at low cost. The basic concept is to use the data obtained by predicting the candidate data with the model made by using a small amount of data, and to use the result as the training data again. Of course, the predicted result is verified by the person or machine named oracle.
  • 48. Active learning Training labeled data unlabeled data pool Prediction Query if detection label == image label an oracle Simpson dataset is an ideal data set for active learning because each unboxed image is labeled. The resnet 50 algorithm is trained 20,000 times with 6,000 box data provided initially. Then, 12,000 unboxed images are predicted, and if the prediction result matches the image label, it is moved to the train dataset.
  • 49. Active learning The number of images adopted in the training set is as follows. At the first prediction, two times the image was adopted, and since then the number has been reduced, but it has been adopted as a steady train set. ID NAME 1 step 2 step 3 step 4 step 5 step 6 step 7 step 1 homer_simpson 626 1830 1980 2018 2050 2061 2066 2 ned_flanders 607 1214 1335 1355 1381 1387 1388 3 moe_szyslak 215 692 945 1121 1170 1238 1246 4 lisa_simpson 578 1116 1217 1234 1271 1271 1271 5 bart_simpson 571 1133 1243 1256 1256 1265 1272 6 marge_simpson 568 1229 1244 1248 1248 1250 1250 7 krusty_the_clown 239 558 1002 1139 1145 1147 1147 8 principal_skinner 519 1054 1110 1114 1116 1119 1122 9 charles_montgomery_burns 663 1058 1085 1099 1104 1105 1107 10 milhouse_van_houten 223 776 897 925 952 954 965 11 chief_wiggum 206 561 788 837 858 863 867 12 abraham_grampa_simpson 608 844 875 878 878 878 878 13 sideshow_bob 202 217 420 651 733 760 772 14 apu_nahasapeemapetilon 223 537 570 581 588 588 590 15 kent_brockman 217 250 270 324 371 415 415 16 comic_book_guy 224 351 387 395 398 403 403 17 edna_krabappel 226 227 327 370 384 386 390 18 nelson_muntz 217 284 287 288 297 297 299 Total 6932 13931 15982 16833 17200 17387 17448 Increase rate 6999 2051 851 367 187 61
  • 50. train , evaluate , export The first and seventh data split ratios. Because the number of images per character differs a lot, the number of training is uneven. However, the number of training data in all classes increased by 30% ~ 300%.
  • 51. train , evaluate , export Because there is a process of predicting 11,000 images, it took 21 hours for a task that took 6 hours and 30 minutes to work with active learning.
  • 52. Test tensor board result When the localization loss value was learned from the existing 6900 chapters, it decreased from 0.115 to 0.06 after applying active learning.
  • 53. Test tensor board result Accuracy for each class has also gradually decreased. In active learning, the initial graph value is suddenly suddenly the train data is suddenly increased, but it is not certain.
  • 54. Test tensor board result 0.66 0.87 IOU accuracy for the entire class has been increased from 0.66 to 0.87.
  • 55. Active learning As data increases, the accuracy of each test case increases. Accuracy increases rapidly in the early days and continues to increase in the future.
  • 57. Active learning Data 17,448 0 to 140,000 0 to 140,000 active learning every 20,000 Data 6,932 0 to 140,000 The first picture is the result of learning 6,932 pieces of data 140,000 times. The second picture is the result of learning 140,000 times while adding training data to active learning every 20,000 times. The last picture is the result of learning 140,400 data from the beginning of 17,448 data from the last result of active learning.
  • 58. Active learning Data 17,448 0 to 140,000 0 to 140,000 active learning every 20,000 Data 6,932 0 to 140,000 Looking at the results, some characters seem to show little increase in accuracy. This is because the distribution of the entire dataset per class is so small that even if active learning is used, the percentage of the character in the entire data set is rather lower. comic_book_guy : 3.23 % edna_krabappel : 3.26 % nelson_muntz : 3.13 % comic_book_guy : 2.30 % edna_krabappel : 2.23 % nelson_muntz : 1.71 %
  • 59. Active learning For kent brockman, the accuracy is fairly high, even though the data rate is 2.37%. Though the test data may be luckily well-detected, I think that the characteristic white hair represents different classes and salient features. kent_brockman : 2.37 %
  • 60. Thank you! 1. Tensorflow Object detection API 2. Transfer learning 3. Object detection API helper tool 4. Active learning ( with test result ) Today we looked at the entire Tensorflow object detection API. I also had a brief introduction to the concepts of tranfer learning and active learning and the helper tool I created. We have confirmed that the accuracy of actual data is increased by active learning with low labeling cost.

Editor's Notes

  1. 구글은 tensorflow기반의 object detection api를 2017년 6월에 공개하였습니다. 다양한 속도와 정확도를 가지는 모델들을 선 제공하여, 일반인들도 쉽게object를 디텍션 할수 있게 되었습니다.
  2. 이미지를 디텍션 하는 여러가지 종류가 있습니다. 단순하게 한장의 이미지가 무엇인지 판별하는 image classification, image classification 에서 정확한 위치까지 localization 할수도 있습니다. 한장의 이미지에서 다양한 object 를 box 형태로 detection 하는 것을 일반적으로 object detection 이라고 합니다. 이번 발표에서 언급하지는 않지만, object 를 box형태가 아닌 shape을 따라서 detection 해주는것을 object sementatin이라고 합니다. 뒤로 갈수록 난이도가 올라가고 object detection api는 box형태와 mask형태 모두를 지원합니다.
  3. tensorflow object detection api의 가장 기초적인 흐름입니다. api에 맞도록 data를 가공하고 이 데이터를 training하고, 모델을 사용가능한 형태로 export하고 , 이 모델을 test 하는 모든 기능이 제공됩니다. 또한 진행중인 또는 진행이 완료된 모델을 평가할 수도 있습니다.
  4. object detection api 를 사용하기 위한 데이터 타입은 tfrecord 파일입니다. 이 tfrecord 파일을 만들기 위한 소스가 제공되고, 아래는 사용 방법입니다. 이미지의 정보가 들어있는 csv 파일과 output directory를 셋팅합니다.
  5. tfrecord 파일의 생성에 필요한 데이터는 다음과 같습니다. 우선, 원본 이미지 파일이 필요합니다. 다음으로는 이미지에 포함된 각 오브젝트의 라벨이 필요합니다. 마지막으로, 각 오브젝트가 이미지에서 어느곳에 위치하고 있는지 알려주는 4개의 숫자값이 필요합니다. xmin , ymin , xmax , ymax 값입니다. 이 값이 있으면, 오브젝트가 위치하는 box형태를 만들수 있습니다. 하나의 이미지에서는 여러개의 object 가 존재할 수 있습니다.
  6. 오브젝트의 위치 포인트를 수작업으로 알아내기는 어렵습니다. 이를 자동으로 해주는 다양한 프로그램이 있습니다. 저는 가장 널리 알려진 labelimg를 사용하였습니다.
  7. labelimg 를 사용하면, 각 object 의 label과 xmin , ymin , xmax , ymax 값이 포함된 xml파일이 생성됩니다.
  8. 트레이닝 시킬 이미지, xml , 각 클래스의 id를 저장하고 있는 labelmap이 있으면 우리는 tfrecord파일을 생성할 수 있습니다.
  9. 기본 코드로는 train, test 파일을 따로 생성해줘야 합니다. 해당 작업은 귀찮은 작업이기 때문에 이를 자동으로 하여주는 프로그램을 만들었습니다. 실행하면 각 클래스가 train, validate 비율대로 알아서 분배되고, 로그가 생성됩니다.
  10. train에 관련한 object detection api python 기본 코드입니다. 트레이닝 시키려는 모델에 대한 설정, training output 디렉토리 등을 설정합니다. 이 외에도 많은 옵션이 존재합니다.
  11. tensorflow 를 object detection 하기위한 pre-train된 여러가지의 모델을 미리 제공하여 줍니다. 각 모델은 서로 다른 속도와 정확도를 가집니다. 모델 이름의 앞부분은 알고리즘 , 뒷부분은 데이터셋을 의미합니다. ssd_mobilenet_v1_coco는 코코데이터 셋을 트레이닝한 ssd_mobilenet 이라는 의미입니다. pre-train 된 데이터셋은 COCO , Kitti , Open Images dataset 이 있습니다.
  12. 만약 여러분이 디텍션 하려는 클래스가 coco나 kitti, open image dataset 에 이미 있는 instance라면, training 작업을 할 필요가 없습니다. 그냥 기본 제공되는 모델을 적절하게 선택하여 쓰시기만 하면 됩니다.
  13. 하지만 사진과 같이 티라노사우르스, 트리케라톱스 등의 공룡을 분류하고 싶다고 한다면, 우리는 위의 클래스에 대한 트레이닝 작업을 거쳐야 합니다. 여기서 우리는 두가지 선택을 할수 있습니다. 완전히 새로운 레이어를 쌓은 모델을 만들거나, 기존 모델을 활용하거나.
  14. 알려진 바에 의하면, 신경망이 깊고 레이어가 넓을수록 정확도는 높아집니다. 사진은 google inception v2 모델입니다. 상당히 깊은 망을 사용하고 , 망이 깊어질수록 기하급수적인 컴퓨팅 리소스가 필요합니다.
  15. 이번에 구글 i/o 2018에서 tpu 3.0을 발표하였습니다. tpu 2.0과 다른 그래픽카드의 flops 파워를 단순 수치화 해보면 다음과 같습니다. 아마 기업이 아닌 일반 사용자가 가질수 있는 최대한의 gpu는 titan 정도일 것입니다. 구글은 tpu 64개를 묶은 tpu pot 를 사용합니다. 구글이 한시간이면 할 작업을 일반인들은 몇주, 몇달이 걸릴수도 있습니다.
  16. 따라서 우리는 transfer learning 이라는 방법을 사용할 것입니다. 여기 공룡을 전혀 알지 못하는 두사람이 있습니다. 한명을 아주 어린 아기이고 다른 한명은 성인입니다. 아기에게 공룡사진을 보여주면, 아기는 아마도 아무 생각도 없을것입니다. 아기는 눈, 꼬리, 발이 무엇인지도 모르고, 심지어 점, 선의 개념도 없을 것입니다. 하지만 성인에게 공룡사진을 보여준다면, 눈, 다리, 꼬리를 보고 동물이라고 인식하게 될 것이고, 피부 질감 등등 여러 정보를 바탕으로 파충류 종류가 아닐까? 라고 생각하게 될것입니다. 아이에게 공룡을 가르치는것은 상당히 어렵겠지만, 어른에게 공룡을 가르치는것은 훨씬 쉬울것입니다. transfer learning은 어른에게 공룡을 가르치는 것과 비슷합니다.
  17. 이 사진은 각 레이어 마다 가지고 있는 가중치를 시각화 한 것입니다. 사진을 자세히 보면, input에서 가까운 쪽 레이어는 선과 같은 저차원 특징을 지니고, output 으로 갈수록, 더 디테일한 고차원 특징을 가짐을 알수 있습니다. 저차원의 특징들은 어느 object 나 가지고 있으므로, transfer learning 에서는 이를 그대로 가지고 갑니다.
  18. inception v2 모델을 예로 들자면, 이전의 레이어들의 가중치는 그대로 가지고 가고 마지막 fully connected layer만 custom label들로 바꿔서 이 가중치만을 학습시킵니다.
  19. 우리는 모델을 training 시키는 도중이나, train이 끝난후에, 이 모델이 얼마나 좋은 모델인지 정량적으로 확인해보고 싶을것입니다. 이때 사용하는 api가 evaluate 입니다. 아래는 object detection api가 제공하는 python evaluate 코드입니다. 우리는 evaluate 결과값을 tensorboard를 통하여 시각적으로 확인할 수 있습니다.
  20. evaluation 된 tensorboard 의 메인화면입니다. 각 내용에 대하여 간단하게 살펴보겠습니다.
  21. 첫번째로 classification 이나 localization 에 대한 loss 값을 확인할 수 있습니다. loss 값은 낮을수록 좋습니다.
  22. 두번째로 각 class 마다의 정확도를 확인할 수 있습니다. 여러가지 이유로 각 캐릭터마다 동일한 분류 정확도를 가지지 않습니다.
  23. 세번째로 전체 validation 데이터에 대한 정확도를 측정할 수 있습니다. 정확도 측정 방식은 0.5IoU입니다.
  24. IoU란 intersectin of union의 약자로써, 실제 groundtruth box와 prediction truth박스의 면적에서 겹쳐지는 면적의 합의 비율입니다. 사진에서 보이는것처럼 예측값이 실제 값 전체를 감싸고 있어도 정확도가 100%라고 할 수 없습니다.
  25. 마지막으로 tensorboard 에서 우리는 evaluation 스텝마다 이미지의 예측값이 어떻게 변화하는지 확인할 수 있습니다.
  26. 위에서 설명한 일련의 data creation, train , evaluate , export는 기본 api는 하나하나 수작업으로 수행해주어야 합니다. 불편함을 느껴 자동으로 수행하여 주는 program을 만들었습니다.
  27. 모든 설정을 기본설정으로 실행가능한 기본설정으로 맞추어 두었으므로 모델의 선택과 training step만 입력하면 됩니다.
  28. 각 process 가 수행된 결과값과 수행시간을 로그로 확인할 수 있습니다.
  29. export 하여 생성된 파일을 로딩하고 메모리에 로딩하는 코드입니다. 이렇게 로딩된 그래프 파일을 활용하여 이미지들을 detection 하게 됩니다.
  30. 발표에 관련된 모든 테스트는 구글 클라우드 virtual machine에서 테스트 하였습니다. 사양은 다음과 같습니다.
  31. 테스트에 사용된 데이터 셋은 kaggle 의 simpson dataset 을 사용하였습니다.
  32. 각 캐릭터들은 전부 라벨링 되어 있습니다. ( 300 ~ 2000 ) 그중 일부의 이미지들은 box 데이터를 가지고 있습니다. ( 200 ~ 600 ) 각 캐릭터별로 data 의 갯수는 일정하지 않습니다.
  33. 제가 만든 tool로 faster rcnn resnet 50 pre-train모델을 140,000 트레이닝 시켰습니다. 10,000번마다 evaluation을 수행합니다. 총 training 시간은 대략 6시간 30분 정도입니다.
  34. 각 캐릭터별 IoU도 성능이 좋지 않습니다.
  35. 수치로는 잘 와닿지 않아서 실제 이미지를 넣어서 확인해보기로 하였습니다. 각 캐릭터당 20장씩 이미지를 predIction 시켰습니다. 제 눈으로 확인하여서 약간의 오류는 존재합니다 :( 각각 T F N의 기준은 다음과 같습니다.
  36. model을 선택함에 있어 tensorflow 는 이미 여러가지의 model을 제공합니다. 우리는 정확도, prediction 속도, training 속도 등을 잘 고려하여 모델을 선택하기만 하면 됩니다. 제가 해본 테스트 상에선 resnet50 , inception v2가 가장 좋은 성능을 보장했습니다.
  37. 한장의 이미지를 사용하여 회전하거나, 확대, 축소하는 방법은 이전부터 널리 사용하던 데이터 증가 방법입니다. 하지만 이미지의 사이즈가 변경되면 이와 같이 box 위치도 변경시켜주어야 하며, 이는 상당히 번거로운 작업이 될것입니다.
  38. 다른 데이터 증가 방법으로는 수작업으로 한장한장 object box boundary를 생성하는 방법입니다. 수작업 라벨링은 많은 라벨링 비용을 초래합니다.
  39. 적은 비용으로 이미지의 라벨링을 할수 있는 Active learning 을 소개합니다. 기본 컨셉은 적은 데이터를 사용하여 만든 모델로 후보군 데이터를 프레딕션 시켜서 나온 결과값을 다시 트레이닝 데이터로 활용하는 것입니다. 물론 예측된 결과값은 oracle이라는 사람또는 기계에 의하여 확인작업을 수행합니다.
  40. simpson dataset은 unboxed image가 각각 labeling 되어 있어서, active learning 을 하기에 아주 적합한 데이터 셋입니다. 최초 제공되는 box 데이터 6,000여장으로 resnet 50 알고리즘을 20,000번 트레이닝 시킵니다. 이후 unboxed image 12,000여장을 prediction 시켜서 prediction 결과값이 image label과 일치하면 이를 train dataset으로 옮깁니다.
  41. 7번의 스텝을 진행하였고, training set 으로 채택된 image 의 갯수는 다음과 같습니다. 처음 prediction 하였을때 2배정도의 이미지가 채택되었고, 그 이후로 갯수는 줄어들지만 꾸준하게 train set 으로 채택되었습니다.
  42. 첫번째와 7번째 데이터 분할 비율입니다. 캐릭터당 이미지 갯수가 많이 차이가 나기때문에 트레이닝 개수가 고르지 않습니다. 하지만 모든 클래스의 training 데이터 수가 30% ~ 300% 증가하였습니다.
  43. 11,000장의 이미지를 prediction 하는 과정이 있기 때문에, 기존에 6시간 30분걸리는 작업이 active learning 을 적용한 작업에선 21시간이 걸렸습니다.
  44. localization loss 값이 기존 6900장으로 학습하였을때, 0.115 에서 active learning 적용한 후에는 0.06으로 줄어들었습니다.
  45. 각 클래스별 정확도 역시 점진적으로 줄어들었습니다. active learning 시 초반 그래프 값이 어지러운 건 갑자기 train 데이터가 급격히 증가해서 인것 같으나 확실하진 않습니다.
  46. 전체 클래스에 대한 IOU 정확도는 0.66 에서 0.87로 증가하였습니다.
  47. 데이터가 증가하면서 각 테스트 케이스별 정확도 증가추이 입니다. 정확도가 초반에 급격하게 증가하고 이후에도 지속적으로 증가합니다.
  48. 결과 값을 보면, 몇몇 캐릭터들은 정확도의 증가가 거의 없어 보입니다. 이는 active learning 을 해도 전체 데이터셋의 클래스당 분포가 워낙 적어, 전체 데이터셋에서 해당 캐릭터가 차지하는 비율이 오히려 더 내려가서 입니다.
  49. kent brockman 의 경우는 데이터 비율이 2.37 % 임에도 불구하고 정확도는 상당히 높습니다. 테스트 데이터가 운이 좋게 잘 detection 된 것일수도 있지만, 저는 특징적인 백발이 다른 클래스와 두드러진 특징을 나타내어서 라고 생각합니다.
  50. 오늘 우리는 Tensorflow object detection api 전반에 대하여 알아보았습니다. 또한, tranfer learning , active learning 의 개념과 제가 만든 helper tool에 대한 소개를 잠깐 했었구요. active learning 으로 적은 labeling cost로 실제 데이터의 정확도가 올라감을 확인하였습니다.