ICESS 2016, Takamatsu, Japan
14 ~ 16 Nov. 2016
Young-Min Kang
Tongmyong University
A Parallel Approach to Object Identification
In Large-scale Images
Sung-Soo Kim, ETRI
Gyung-Tae Nam, GCSC Inc.
Bigger images
• Era of Big data
– Increased size of image data
• Image processing
– Heavy Computation
• One of the most fundamental operations
– Object identification/recognition
• Image segmentation
• Connected components labeling
Connected component labeling
• Objective
– All pixels in a connected component have an identical label
Parallel image processing
• Most image processing algorithms
– Pixel-wise operations
• can be implemented with pixel-wise threads
• can be efficiently performed in a data-parallel fashion
• GPU
– Data parallel device
– can be easily applied to various image processing methods
GPU:
Many-core architecture
Pixel connectivity
• Graph representation
Image Pixel connectivity
CCL and parallelism
• CCL with graph traversal
– cannot be easily parallelized
• Traversal = sequential
• GPU-based approaches
– have not been very successful
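For reference, traversal-based CCL can be sketched as below. This is only an illustrative baseline, not the method of these slides; it assumes 4-connectivity, and the function name `label_sequential` is hypothetical. Each component is flooded one pixel at a time, which is why the traversal is hard to parallelize.

```python
from collections import deque

def label_sequential(mask):
    """Label 4-connected components of a binary image by breadth-first traversal."""
    h, w = len(mask), len(mask[0])
    labels = [[-1] * w for _ in range(h)]   # -1 marks background or not-yet-visited
    next_label = 0
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or labels[r][c] != -1:
                continue
            # Start a new component and flood it, one pixel after another.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and labels[ny][nx] == -1):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label

demo = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
demo_labels, demo_count = label_sequential(demo)
print(demo_count)
```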
Our method
• GPU-based efficient algorithm for CCL
– Data initialization
– Computing column-wise label runs
– Efficient label merge
Data initialization
• Each pixel is assigned a unique label if it is turned on
• Example: candidate labels (pixel indices) for a 5x5 image
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
• Example: labels after initialization (-1 marks pixels that are turned off)
1 2 -1 -1 -1
6 7 -1 9 10
11 12 -1 14 -1
-1 -1 -1 19 20
-1 -1 -1 -1 -1
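A minimal sketch of the initialization step, assuming the 5x5 example above with 1-based raster indices as labels and -1 for pixels that are off. The name `init_labels` and the `mask` layout are hypothetical; on a GPU each pixel would be handled by its own thread, which the nested comprehension only imitates.

```python
def init_labels(mask):
    """Give every 'on' pixel its 1-based raster index as label; others get -1."""
    h, w = len(mask), len(mask[0])
    return [[r * w + c + 1 if mask[r][c] else -1 for c in range(w)]
            for r in range(h)]

# Binary image matching the 5x5 example (1 = turned on).
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
labels = init_labels(mask)
for row in labels:
    print(" ".join(f"{v:2d}" for v in row))
```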
Column-wise label runs
• Run
– Block of contiguous object pixels in a column
• Computing column-wise label runs
– Can be done with w threads, one per column (w: image width, h: image height)
Column-wise label runs
• Label change within a column (1 thread)
Column-wise label runs
• Graph-based interpretation
Column-wise label runs
• Implementation
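One way the column-wise run computation could look is sketched below. It is an assumption-laden simplification: labels are stored directly (the top label of each run is copied downward) rather than as a tree of parent links, and the name `column_runs` is hypothetical. The outer loop stands in for the w independent threads.

```python
def column_runs(labels):
    """Within each column, copy the top label of every run down the run.

    `labels` holds -1 for background and a unique positive label otherwise.
    """
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for c in range(w):              # one (simulated) thread per column
        for r in range(1, h):       # sequential scan down the column
            if out[r][c] != -1 and out[r - 1][c] != -1:
                out[r][c] = out[r - 1][c]   # continue the run started above
    return out

initial = [
    [1, 2, -1, -1, -1],
    [6, 7, -1, 9, 10],
    [11, 12, -1, 14, -1],
    [-1, -1, -1, 19, 20],
    [-1, -1, -1, -1, -1],
]
runs = column_runs(initial)
for row in runs:
    print(" ".join(f"{v:2d}" for v in row))
```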
Label merge
• After computing “column-wise label runs”
– We have separate trees to be merged according to
their connectivity
• What is needed
– Checking vertical adjacency
Label merge
• Connectivity check
Label merge
• Updated hierarchy
Why only roots are changed
Let’s merge
OK! I will
follow you
Why only roots are changed
Merged tree
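The idea that only roots change can be illustrated with a small parent-array sketch. The helper names `find_root` and `merge`, and the rule that the larger root index follows the smaller one, are assumptions made for illustration.

```python
def find_root(parent, x):
    """Follow parent links until reaching a node that is its own parent."""
    while parent[x] != x:
        x = parent[x]
    return x

def merge(parent, a, b):
    """Merge the trees containing a and b by re-pointing one root only."""
    ra, rb = find_root(parent, a), find_root(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)   # the larger root follows the smaller
    return min(ra, rb)

# Two trees: {0, 1, 2} rooted at 0 and {3, 4} rooted at 3.
parent = [0, 0, 1, 3, 3]
merge(parent, 2, 4)
print(parent)   # [0, 0, 1, 0, 3]
```

Only the entry for root 3 is rewritten; node 4 still points to 3 and reaches the new root through it.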
Previous methods
1. Check the connectivity
2. Update the hierarchy
3. Iterate this process until no update is made
A kind of graph traversal
Heavy computation when the pixels make a
long connected chain
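The cost of iterating until no update is made can be seen in a simplified stand-in for this class of methods (not any specific published algorithm). It assumes 4-connectivity and that every pixel takes the smallest label among its neighbours in each pass; `label_by_iteration` is a hypothetical name.

```python
def label_by_iteration(mask):
    """Repeat 'check neighbours, take the smallest label' until nothing changes."""
    h, w = len(mask), len(mask[0])
    labels = [[r * w + c + 1 if mask[r][c] else -1 for c in range(w)]
              for r in range(h)]
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        new = [row[:] for row in labels]    # all pixels read the same snapshot
        for r in range(h):
            for c in range(w):
                if labels[r][c] == -1:
                    continue
                best = labels[r][c]
                for y, x in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= y < h and 0 <= x < w and labels[y][x] != -1:
                        best = min(best, labels[y][x])
                if best != labels[r][c]:
                    new[r][c] = best
                    changed = True
        labels = new
    return labels, passes

# A single horizontal chain of n pixels needs n passes in this sketch
# (n - 1 to propagate the label, 1 more to detect convergence).
for n in (4, 16, 64):
    _, passes = label_by_iteration([[1] * n])
    print(n, passes)
```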
Our method
• Label merge is performed with a fixed
number of iterations
– Number of iterations
• log2(w)
– Computation cost at every iteration
• reduced to half that of the previous iteration
• Efficient label merge
• Moreover
– Can be easily parallelized
Label merge boundary
• 1st merge
w/2 boundaries
h comparisons in each boundary
→ wh/2 threads
Label merge boundary
• 2nd merge
w/2^2 boundaries
h comparisons in each boundary
→ wh/2^2 threads
Label merge boundary
• 3rd merge
w/2^3 boundaries
h comparisons in each boundary
→ wh/2^3 threads
Label merge boundary
• Final merge
log2(w)-th merge
Computation cost at the 1st merge: C(1)
Total cost: C(1) + C(1)/2 + ... + C(1)/2^(log2(w)-1) = 2 C(1) (1 - 1/w) < 2 C(1)
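The merge schedule can be sketched sequentially as follows. This is an illustration under assumptions, not the GPU implementation: pixels are indexed in raster order, the smaller root index wins a merge, and the inner loops stand in for threads (a real parallel version would need atomic updates). The names `find_root` and `label_hierarchical` are hypothetical.

```python
def find_root(parent, x):
    """Follow parent links up to the root of x's tree."""
    while parent[x] != x:
        x = parent[x]
    return x

def label_hierarchical(mask):
    h, w = len(mask), len(mask[0])
    # parent[i] == i marks a root, -1 marks background; pixel (r, c) is r*w + c.
    parent = [i if mask[i // w][i % w] else -1 for i in range(h * w)]
    # Column-wise runs: each object pixel points at the object pixel above it.
    for c in range(w):
        for r in range(1, h):
            if mask[r][c] and mask[r - 1][c]:
                parent[r * w + c] = (r - 1) * w + c
    # Merge stages: the block width doubles, so the boundary count halves.
    threads_per_stage = []
    block = 1
    while block < w:
        boundaries = list(range(block, w, 2 * block))   # right-hand column of each boundary
        threads_per_stage.append(len(boundaries) * h)
        for c in boundaries:
            for r in range(h):                          # h comparisons per boundary
                if mask[r][c - 1] and mask[r][c]:
                    ra = find_root(parent, r * w + c - 1)
                    rb = find_root(parent, r * w + c)
                    if ra != rb:
                        parent[max(ra, rb)] = min(ra, rb)   # only a root changes
        block *= 2
    labels = [[find_root(parent, r * w + c) if mask[r][c] else -1
               for c in range(w)] for r in range(h)]
    return labels, threads_per_stage

# A long horizontal chain (w = 8, h = 1) is resolved in log2(8) = 3 stages.
chain, stage_threads = label_hierarchical([[1] * 8])
print(chain, stage_threads)
```

With w = 8 and h = 1 the stages use 4, 2, and 1 comparisons, matching the wh/2, wh/2^2, wh/2^3 pattern.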
Performance
• Relative computational cost of each task
– Costs are normalized so that initialization = 1
– 4096x4096 images with different numbers of connected components

                   50 labels   1869 labels
initialization        1.0          1.0
column-wise run       1.6          1.6
label merge           3.4          3.6
[Bar chart: relative cost (0.0 to 4.0) of initialization, column-wise run, and label merge for images with 50 labels and 1869 labels]
Experimental results
• Reference
– Grana’s method implemented with OpenCV
• Two Tests
– Random noise with varying densities
– Object identification with shapes
Varying densities
• Image size: 2048x2048
Varying densities
• Image size: 4096x4096
Object identification with shapes
• Two spiral curves
Object identification with shapes
• Stars
Applications
• Object tracking with radar signal
Conclusion
• An efficient GPGPU implementation for
CCL
• Data-parallelism of GPU exploited
• Experimental results show its efficiency
• Can be successfully applied to various
applications with large-scale images
– e.g., Object identification from radar signals
감사합니다.
ありがとうございます
谢谢
Thank you
Q & A

Fast CCL (connected component labeling) with GPU