2. What is optical flow estimation?
● Optical flow... represents the apparent motion of objects.
● Optical flow estimation... can predict the movement of objects in a video.
Miloud, Hadj achour. (2017). Fragmentation de métal liquide dans l’eau.
https://www.codeproject.com/Articles/1192205/Capturing-motion-from-video-using-the-Emgu-CV-libr
3. Optical flow estimation is important
● Widely used by insects and birds
● Practical usage
○ Analyze motion
○ Avoid collision
○ Assist in navigation
● Real-world Applications
○ Video/Motion classification
○ Navigation assistance
■ Self driving cars
■ Drones https://nanonets.com/blog/optical-flow/
4. DL approaches are increasing
● 2015. FlowNet S (Simple)
● 2015. FlowNet C (Correlation)
● 2016. FlowNet 2
● 2018. LiteFlowNet
● ...
Color coding
https://www.youtube.com/watch?v=k_wkDLJ8lJE https://www.youtube.com/watch?v=pfQ0zFwv-hM
5. However...
● Existing DL approaches require a GPU to execute
👎 High power consumption
👎 Low runtime speed in CPU environments
● We propose LmFlowNet S 👍
○ Modification of FlowNet S [P. Fischer+ 2015]
○ Goals:
■ Edge computing
■ Run on an FPGA-based accelerator
■ Use quantization to reduce inference time while achieving good prediction performance
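To make the quantization goal concrete: quantizing weights and activations to a few bits replaces floating-point arithmetic with cheap integer-like operations, which is what makes FPGA inference fast. A minimal sketch of uniform quantization in numpy (a generic illustration; Blueoil's actual quantizers may differ, and `quantize_uniform` is our own name):

```python
import numpy as np

def quantize_uniform(x, bits=2, x_max=1.0):
    """Uniformly quantize values in [-x_max, x_max] to 2**bits levels."""
    levels = 2 ** bits - 1
    x = np.clip(x, -x_max, x_max)
    # map to [0, 1], round to the nearest of `levels` steps, map back
    q = np.round((x + x_max) / (2 * x_max) * levels) / levels
    return q * 2 * x_max - x_max

w = np.array([-0.9, -0.2, 0.1, 0.7])
print(quantize_uniform(w, bits=2))  # [-1.  -0.33333333  0.33333333  1.]
```

With 2 bits, every weight collapses to one of four values, so a multiply-accumulate can be implemented with shifts and adds on the accelerator.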
7. Network of FlowNet S
[Network diagram: input [N, 384, 512, 6] → output [N, 384, 512, 2]]
Detailed ops inside each color block are shown in appendix
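The [N, 384, 512, 6] input comes from stacking the two RGB frames of a pair channel-wise. A minimal numpy sketch of that preprocessing step:

```python
import numpy as np

# two consecutive RGB frames, each H x W x 3
frame1 = np.zeros((384, 512, 3), dtype=np.float32)
frame2 = np.zeros((384, 512, 3), dtype=np.float32)

# stack along the channel axis, then add the batch dimension N
pair = np.concatenate([frame1, frame2], axis=-1)   # (384, 512, 6)
batch = pair[np.newaxis, ...]                      # (1, 384, 512, 6)
print(batch.shape)  # (1, 384, 512, 6)
```

The network then predicts a 2-channel flow field (u, v) at the same spatial resolution, giving the [N, 384, 512, 2] output.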
8. Network of FlowNet S
[Network diagram: input [N, 384, 512, 6] → output [N, 384, 512, 2]; encoder and decoder halves highlighted]
Detailed ops inside each color block are shown in appendix
9. Network of FlowNet S
[Network diagram: input [N, 384, 512, 6] → output [N, 384, 512, 2]; ops not supported by Blueoil DLK highlighted]
Detailed ops inside each color block are shown in appendix
10. Network of LmFlowNet S (DLK supported)
[Network diagram: input [N, 384, 512, 6] → output [N, 384, 512, 2]]
Detailed ops inside each color block are shown in appendix
11. Network of LmFlowNet S
[Network diagram: input [N, 384, 512, 6] → output [N, 384, 512, 2]; quantized blocks highlighted]
Detailed quantization inside each color block is shown in appendix
12. Loss function: End Point Error (EPE)
EPE between a predicted flow vector ending at (x1, y1) and the ground-truth vector ending at (x2, y2):

EPE = √((x1 − x2)² + (y1 − y2)²)

Weighted EPE = 0.32 × EPEflow2 + 0.08 × EPEflow3 + 0.02 × EPEflow4 + 0.01 × EPEflow5 + 0.005 × EPEflow6

[Figure: EPE computed at each decoder scale against down-sampled ground truth (192 × 256, 96 × 128, 48 × 64, 24 × 32, 12 × 16)]

Training hyper-parameters are shown in appendix
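The per-scale EPE and the weighted sum above translate directly into numpy; the weights follow the slide, while the helper name `epe` is ours:

```python
import numpy as np

def epe(pred, gt):
    """Average end point error: Euclidean distance between predicted
    and ground-truth flow vectors, averaged over all pixels."""
    return np.sqrt(((pred - gt) ** 2).sum(axis=-1)).mean()

# toy flows at one scale: H x W x 2 (u, v components)
pred = np.zeros((4, 4, 2)); pred[..., 0] = 3.0; pred[..., 1] = 4.0
gt = np.zeros((4, 4, 2))
print(epe(pred, gt))  # 5.0 for a constant (3, 4) error vector

# multi-scale training loss with the slide's weights
weights = {"flow2": 0.32, "flow3": 0.08, "flow4": 0.02,
           "flow5": 0.01, "flow6": 0.005}
losses = {k: epe(pred, gt) for k in weights}  # one EPE per scale
weighted_epe = sum(weights[k] * losses[k] for k in weights)
```

In practice each `losses[k]` is computed against the ground truth down-sampled to that scale's resolution.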
13. Artificial dataset: Flying Chairs
● Dataset

  Name            Frame pairs   Train:validation ratio   Size
  Flying Chairs   22,872        9:1                      30 GB
● Data Augmentation
○ Crop, Rotate, Translate, FlipLeftRight, FlipTopBottom
○ Gaussian noise, Brightness, Contrast, Gamma, and Color
Parameters used in data augmentation are shown in appendix
https://arxiv.org/pdf/1504.06852.pdf
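One subtlety of augmenting flow data: every geometric transform must be applied to both frames and to the ground-truth flow, and a left-right flip also flips the sign of the horizontal flow component. A minimal sketch (assumption: flow is stored as H x W x 2 with (u, v) components; `flip_left_right` is our own name):

```python
import numpy as np

def flip_left_right(frame1, frame2, flow):
    """Horizontally flip an image pair and its ground-truth flow.
    Mirroring the x axis reverses horizontal motion, so the u
    component of the flow changes sign; v is unchanged."""
    frame1 = frame1[:, ::-1]
    frame2 = frame2[:, ::-1]
    flow = flow[:, ::-1].copy()
    flow[..., 0] = -flow[..., 0]
    return frame1, frame2, flow

f1 = np.random.rand(8, 8, 3)
f2 = np.random.rand(8, 8, 3)
flow = np.random.rand(8, 8, 2)
g1, g2, gflow = flip_left_right(f1, f2, flow)
```

Photometric augmentations (noise, brightness, contrast, gamma, color) touch only the images and leave the flow untouched.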
15. Results - Avg. EPE & inference time

  Method                  Training iterations   Avg. EPE (pixel)   Inference time per frame (ms) [1]
                                                (Flying Chairs)    CPU (dlk-convert)   GPU (tensorflow)
  FlowNet S               1.2M                  2.94               -                   11.65
  LmFlowNet S             400K                  5.33               1360.49             13.81
  LmFlowNet S Quantized   400K                  9.01               637.467             17.60

[1] CPU and GPU specs available in appendix
17. Live demonstration
● Three demonstrations
○ FlowNet S
○ LmFlowNet S
○ LmFlowNet S Quantized
● NOTE: Running on GPU (not on CPU / FPGA)
○ Failed to run on CPU/FPGA due to several problems 😢
■ e.g. segmentation fault, memory error...
○ Fixing and debugging them in the future 👊
20. Challenges
● Training takes a very long time ( > 2 weeks...😢)
○ Heavy data augmentation & pre-processing
■ Pre-processing on GPU is not supported yet.
● Unique network structure, not compatible with Blueoil
○ Input is a stack of 2 images (6 channels)
○ Multiple and branched outputs
● DLK limitations. No documentation. 🤯
○ No support for kernel sizes 7x7, 5x5
○ No support for Conv2dTranspose
○ Cannot concat quantized and float values together
○ Requires the output depth of Space2Depth to be 32 * N
21. Thank you for your attention!
Our source code:
https://github.com/ki-lm/blueoil/tree/lmflownets
26. LmFlowNet S | Data Augmentation
● Translation: [−20%, 20%] of the image width for x and y
● Rotation: [−17°, 17°]
● Scaling: [0.9, 2.0]
● Gaussian noise: sigma uniformly sampled from [0, 0.04]
● Contrast: [−0.8, 0.4]
● Multiplicative color changes to the RGB channels per image: [0.5, 2]
● Gamma values: [0.7, 1.5]
● Additive brightness changes: Gaussian with a sigma of 0.2
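Sampling one set of augmentation parameters from the ranges above can be sketched as follows (the dictionary keys are our own names, and we take the translation/rotation ranges as symmetric around zero):

```python
import numpy as np

rng = np.random.default_rng(0)

# one random parameter set, drawn from the ranges on this slide
params = {
    "translation": rng.uniform(-0.2, 0.2, size=2),  # fraction of width, (x, y)
    "rotation_deg": rng.uniform(-17.0, 17.0),
    "scale": rng.uniform(0.9, 2.0),
    "noise_sigma": rng.uniform(0.0, 0.04),
    "gamma": rng.uniform(0.7, 1.5),
    "color_mult": rng.uniform(0.5, 2.0, size=3),    # per RGB channel
    "brightness": rng.normal(0.0, 0.2),             # additive, Gaussian
}
```

A fresh set is drawn per training sample, which is part of why pre-processing dominates training time.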
28. Our FlowNet S versions

  Version          Architecture                                         DLK support
  V1 (FlowNet S)   Same as the paper                                    X
  V2               7x7, 5x5 => 3x3;
                   striding 2 => SpaceToDepth                           X
  V3               Conv2dTranspose => ResizeNearestNeighbor + Conv2d;
                   ResizeBilinear => ResizeNearestNeighbor              X
  V3 Quant.        Quantize all layers except the first and last
  (LmFlowNet S)    layers, and the activation before the last layer     △
  V4 Quant.        Change all output depths of SpaceToDepth to 32 * N   O
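The two V2/V3 substitutions in the table can be sketched in numpy: SpaceToDepth trades spatial resolution for channels (so a stride-2 conv becomes SpaceToDepth followed by a stride-1 conv), and nearest-neighbor resize provides the upsampling that Conv2dTranspose would otherwise do. Function names are ours:

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange (H, W, C) -> (H/block, W/block, C*block**2),
    moving each block x block spatial patch into the channel axis."""
    h, w, c = x.shape
    x = x.reshape(h // block, block, w // block, block, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(
        h // block, w // block, c * block * block)

def resize_nearest(x, factor=2):
    """Nearest-neighbor upsampling, used in place of Conv2dTranspose
    (a stride-1 Conv2d follows it to mix channels)."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

x = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)
print(space_to_depth(x).shape)   # (2, 2, 12)
print(resize_nearest(x).shape)   # (8, 8, 3)
```

Both ops are pure data movement, which is why they map well onto the FPGA accelerator; the V4 change then pads SpaceToDepth output depths to multiples of 32 to meet the DLK constraint.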
29. List of source code links
● FlowNet S/C, and 2 (TensorFlow):
https://github.com/sampepose/flownet2-tf/
● FlowNet S/C (original paper, Caffe):
https://lmb.informatik.uni-freiburg.de/Publications/2015/DFIB15/
● FlowNet 2 (Original paper, Caffe):
https://github.com/lmb-freiburg/flownet2