This document outlines the steps to implement a vision-based deep learning solution using open source tools, including data collection, annotation, training and inference. It discusses collecting data through web crawling, live video recording, and image capture. Data is then preprocessed, labeled, and annotated using open source tools. A YOLO object detection model is trained on labeled data using the DarkFlow framework. The trained model is then deployed for inference on edge devices using Intel OpenVINO. Challenges discussed include the need for large amounts of varied data, iterative tuning, and automation of annotation.
Implementing Vision DL Solution using Open Source Tools
1. Unearth the Journey of Implementing Vision based Deep Learning Solution using Open Source Tools
Neethu Elizabeth Simon
Software Engineer
Intel Corporation
Phoenix, AZ
10. Implementation
Collect
• Data Acquisition & Organization
• Preprocessing
Annotation
• Label Data
Training
• Build Model
• Improve Models to meet performance & accuracy objectives
Inference
• Deploy Model
• Make Predictions by applying the Trained Model on new Data
17. Implementation – Training
[Diagram: lots of labeled data (dried_chilli_pepper, cinnamon_stick) goes through forward and backward passes to produce the model weights]
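The forward/backward loop in the training diagram can be sketched on a toy problem. This is only an illustration under stated assumptions: the one-weight "model", the data, the learning rate and the step count are all made up here, and a real YOLO training run in DarkFlow is far larger.

```python
# Toy data for y = 2 * x; the "model" is a single weight w (illustrative only).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def train(steps=50, lr=0.05):
    """Fit w by gradient descent and return the final weight and the loss history."""
    w = 0.0
    losses = []
    for _ in range(steps):
        # Forward pass: predictions and mean squared error.
        preds = [w * x for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
        losses.append(loss)
        # Backward pass: gradient of the loss with respect to w.
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Weight update: this is what ends up stored as "model weights".
        w -= lr * grad
    return w, losses


w, losses = train()
```

Each iteration runs one forward pass, one backward pass and one weight update; the loss history should shrink as w approaches 2.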
18. Implementation – Training
Data
• Type/Amount – Visual, Sensors
• Labeled Datasets – PASCAL VOC, COCO, KITTI, ImageNet or ILSVRC, SUN
• Custom Data
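PASCAL VOC keeps one XML annotation file per image, which is also the style of annotation DarkFlow reads for custom data, as far as its README describes. A minimal sketch of reading such a file with the Python standard library follows; the file name, image size and box coordinates in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one image; the file name, image size and box
# coordinates are invented for illustration.
SAMPLE_XML = """<annotation>
  <filename>spice_0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cinnamon_stick</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>300</xmax><ymax>210</ymax></bndbox>
  </object>
  <object>
    <name>dried_chilli_pepper</name>
    <bndbox><xmin>350</xmin><ymin>150</ymin><xmax>520</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) from VOC-style XML."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = tuple(int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label,) + coords)
    return filename, boxes


filename, boxes = parse_voc(SAMPLE_XML)
```

A check like this is handy for counting labels per class before training, so that a mistyped label is caught early.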
19. Implementation – Training
Compute Platform
• Complexity Determined by Data
• Cloud vs Edge
• Intel® Xeon vs Core vs Atom
Selected: Intel Core + GPU
20. Implementation – Training
Object Detection Algorithm
• YOLO – You Only Look Once
• SSD, R-CNN, Fast R-CNN, Faster R-CNN, R-FCN
Selected: YOLO
21. Implementation – Training
Deep Learning Framework
• DarkNet - C & CUDA - https://pjreddie.com/darknet/
• DarkFlow - TensorFlow implementation of DarkNet - https://github.com/thtrieu/darkflow
• Others – PyTorch, Caffe
Selected: DarkFlow
22. Implementation – Training
Training run (YOLO, Intel Core + GPU, DarkFlow)
• Images – 1000
• Training Time – 10 hrs
• Model Size – 800 MB
• Average loss – ~0.9
23. Implementation – Training
DarkFlow training command:
python flow --train --model cfg/yolov2-2c.cfg --load bin/yolov2.weights --annotation all_Spice_Annotations/ --dataset all_Spice_Images/
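The same training run can probably be started from Python instead of the command line. The sketch below assumes the TFNet interface described in the DarkFlow README and that cfg/yolov2-2c.cfg is a two-class copy of the YOLOv2 config (the class count is inferred from the file name and the two spice labels, not stated on the slide). The helper names are hypothetical, and the filter rule num * (classes + 5) is the one the README gives for custom classes.

```python
def yolov2_last_conv_filters(num_classes, num_anchors=5):
    """Filters for the last convolutional layer: num * (classes + 5)."""
    return num_anchors * (num_classes + 5)


def build_train_options():
    """Options mirroring the flags of the training command on the slide."""
    return {
        "model": "cfg/yolov2-2c.cfg",
        "load": "bin/yolov2.weights",
        "train": True,
        "annotation": "all_Spice_Annotations/",
        "dataset": "all_Spice_Images/",
    }


def run_training():
    """Start training; assumes DarkFlow is installed and the paths above exist."""
    # Imported lazily so the helpers above work without DarkFlow installed.
    from darkflow.net.build import TFNet

    tfnet = TFNet(build_train_options())
    tfnet.train()


# Assumption: two classes (dried_chilli_pepper, cinnamon_stick).
filters = yolov2_last_conv_filters(num_classes=2)
```

With two classes the rule gives 35 filters, which is the value the custom cfg would need in its last convolutional layer, alongside classes=2 in the region layer.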
24. Implementation – Inference
[Diagram: the model weights are applied in a forward pass to a new image to predict a label, e.g. dried_chilli_pepper]
Edge Device – Laptop/NUC/Mobile
Intel® OpenVINO Toolkit – https://software.intel.com/en-us/openvino-toolkit
• Open Source Tool for Inference
• Support Different Models (Caffe/TensorFlow/MXNet/ONNX/Kaldi)
• Model Optimization
• Speed Deployment
DarkFlow inference command:
python flow --imgdir all_test_images/ --model cfg/yolov2-2c.cfg --load new_model_weights
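After inference, the detections usually need filtering before they are shown or acted on. The sketch below assumes the list-of-dicts result format that the DarkFlow README documents for return_predict (label, confidence, topleft, bottomright); the sample values and the 0.5 threshold are hypothetical.

```python
# Hypothetical detections in the shape DarkFlow's return_predict() is
# documented to produce; the numbers are invented for illustration.
SAMPLE_PREDICTIONS = [
    {"label": "dried_chilli_pepper", "confidence": 0.82,
     "topleft": {"x": 40, "y": 60}, "bottomright": {"x": 200, "y": 310}},
    {"label": "cinnamon_stick", "confidence": 0.12,
     "topleft": {"x": 220, "y": 90}, "bottomright": {"x": 400, "y": 180}},
]


def keep_confident(predictions, threshold=0.5):
    """Return (label, confidence, (x1, y1, x2, y2)) for detections at or above threshold."""
    kept = []
    for p in predictions:
        if p["confidence"] >= threshold:
            box = (p["topleft"]["x"], p["topleft"]["y"],
                   p["bottomright"]["x"], p["bottomright"]["y"])
            kept.append((p["label"], p["confidence"], box))
    return kept


detections = keep_confident(SAMPLE_PREDICTIONS)
```

Raising the threshold trades missed objects for fewer false detections, so it is worth tuning on held-out test images.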
28. Implementation Challenges
Collect
• More is Better
• Object Orientation, angle, lighting
• Remove Bad/Unrelated Data
• Require SME knowledge to define bad data & remove unrelated data
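One common way to get more variety in orientation, angle and lighting without collecting new images is to generate variants of the ones already collected. This is a swapped-in illustration, not something the slides describe: the functions below are hypothetical and work on a tiny grayscale image stored as a list of rows.

```python
def flip_horizontal(image):
    """Mirror each row of a 2D grayscale image (list of rows)."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_brightness(image, factor):
    """Scale pixel values by factor and clamp them to the 0..255 range."""
    return [[max(0, min(255, int(round(p * factor)))) for p in row] for row in image]


# Tiny made-up image: 2 rows, 3 columns.
image = [
    [10, 20, 30],
    [40, 50, 60],
]
variants = [
    flip_horizontal(image),
    rotate_90_clockwise(image),
    adjust_brightness(image, 1.5),
]
```

For object detection, any flip or rotation must also be applied to the bounding boxes in the annotation, otherwise the labels no longer match the image.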
30. Implementation Challenges
Annotation
• Manual vs Automation
• Time consuming
• Expensive
• Difficult to Scale
31. Implementation Challenges
Training
• Iterative Learning
• Varied Data parameters & tuning
• Hyperparameter optimization & data overfitting
• High Compute
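Overfitting is easier to spot when part of the labeled data is held out and never used for training. A minimal sketch of such a split follows; the 80/20 ratio, the seed and the file names are assumptions, with 1000 items chosen to match the image count of the training run.

```python
import random


def split_dataset(items, val_fraction=0.2, seed=0):
    """Shuffle items reproducibly and split them into (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names standing in for the 1000 training images.
image_names = ["img_%04d.jpg" % i for i in range(1000)]
train_set, val_set = split_dataset(image_names)
```

If the training loss keeps falling while detections on the held-out set get worse, the model is likely memorizing the training images and the hyperparameters or the data need another iteration.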
32. Implementation Challenges
Inference
• Constantly Update Models
• Complexity in Hardware design – combination of CPU/GPU/Accelerators
• Data Privacy & Security