Real-Time 3-D Wavelet Lifting
David Barina, Pavel Zemcik
Faculty of Information Technology
Brno University of Technology
3-D Wavelet Lifting
[Figure: 3-D decomposition of a volume into eight subbands: LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH]
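The eight subband labels can be made concrete with a small sketch. It assumes a layout the slides do not state: the transform is computed in place, so low-pass (L) coefficients sit at even coordinates and high-pass (H) coefficients at odd ones, and the three letters refer to the x, y, and z axes in that order. The function name `subband` and the flat, x-fastest storage are hypothetical choices for illustration.

```python
def subband(vol, shape, name):
    """Collect the coefficients of one subband, e.g. 'LLH'.

    Assumptions (not from the slides): vol is a flat list with x as the
    fastest axis; the letters of `name` map to the x, y, z axes in that
    order; 'L' means even coordinate, 'H' means odd coordinate.
    """
    nx, ny, nz = shape
    px, py, pz = (0 if letter == "L" else 1 for letter in name)
    return [vol[x + nx * (y + ny * z)]
            for z in range(pz, nz, 2)
            for y in range(py, ny, 2)
            for x in range(px, nx, 2)]
```

For a volume with even dimensions, each of the eight subbands then holds one eighth of the coefficients.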
1-D Wavelet Lifting
[Figure: 1-D lifting scheme with four lifting steps α, β, γ, δ]
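\[
\tilde{P}(z) =
\begin{bmatrix} 1 & \alpha\,(1 + z^{-1}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \beta\,(1 + z) & 1 \end{bmatrix}
\begin{bmatrix} 1 & \gamma\,(1 + z^{-1}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \delta\,(1 + z) & 1 \end{bmatrix}
\begin{bmatrix} \zeta & 0 \\ 0 & 1/\zeta \end{bmatrix}
\]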
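The factorization can be read as four in-place lifting steps followed by a scaling. The sketch below is one possible reading, not the authors' implementation. It assumes the CDF 9/7 wavelet (the slides show only the symbols, so the numeric constants are an assumption), symmetric boundary extension, and a signal of at least two samples. The names `lift_step`, `forward_1d`, and `inverse_1d` are hypothetical.

```python
# Assumed CDF 9/7 lifting constants; the slides give only the symbols.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
ZETA = 1.149604398


def lift_step(x, coeff, start):
    """Add coeff * (left + right neighbour) to every second sample.

    Samples start, start + 2, ... are updated in place. Their neighbours
    have the opposite parity, so they are not modified by this step.
    Boundaries use symmetric extension (the missing neighbour is mirrored).
    """
    n = len(x)
    for i in range(start, n, 2):
        left = x[i - 1] if i > 0 else x[i + 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += coeff * (left + right)


def forward_1d(signal):
    """Forward transform; even indices end up low-pass, odd high-pass."""
    x = [float(v) for v in signal]
    lift_step(x, ALPHA, 1)   # alpha updates odd samples
    lift_step(x, BETA, 0)    # beta updates even samples
    lift_step(x, GAMMA, 1)   # gamma updates odd samples
    lift_step(x, DELTA, 0)   # delta updates even samples
    return [v * ZETA if i % 2 == 0 else v / ZETA for i, v in enumerate(x)]


def inverse_1d(coeffs):
    """Undo forward_1d by reversing the steps with negated coefficients."""
    x = [v / ZETA if i % 2 == 0 else v * ZETA for i, v in enumerate(coeffs)]
    lift_step(x, -DELTA, 0)
    lift_step(x, -GAMMA, 1)
    lift_step(x, -BETA, 0)
    lift_step(x, -ALPHA, 1)
    return x
```

Because each step only adds a function of untouched neighbours, the inverse is obtained by subtracting the same quantity in reverse order, which is why lifting can run in place.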
Naive Approaches
Naive Horizontal
foreach dimension do /* X, Y, Z axis */
    foreach lifting do
        foreach sample do
            step;
        end
    end
end
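The horizontal loop order above can be sketched as follows. This is an illustration under assumptions, not the measured code: it assumes CDF 9/7 constants, a flat list with x as the fastest axis, symmetric boundary extension, every dimension of size at least two, and it omits the final scaling. The name `horizontal_3d` is hypothetical.

```python
# Assumed CDF 9/7 steps as (coefficient, parity of the updated samples).
STEPS = [(-1.586134342, 1), (-0.05298011854, 0),
         (0.8829110762, 1), (0.4435068522, 0)]


def horizontal_3d(vol, shape):
    """Naive horizontal ordering: every lifting step is a full pass.

    vol is a flat list (x fastest), shape = (nx, ny, nz).
    Returns the number of passes made over the whole volume.
    """
    nx, ny, nz = shape
    strides = (1, nx, nx * ny)
    passes = 0
    for axis in range(3):                      # foreach dimension
        n, s = shape[axis], strides[axis]
        for coeff, parity in STEPS:            # foreach lifting
            for base in range(len(vol)):       # foreach sample
                k = (base // s) % n            # coordinate along this axis
                if k % 2 != parity:
                    continue
                # Mirror the missing neighbour at the boundaries.
                left = base - s if k > 0 else base + s
                right = base + s if k + 1 < n else base - s
                vol[base] += coeff * (vol[left] + vol[right])
            passes += 1
    return passes
```

With four lifting steps per axis, this ordering walks the entire volume twelve times, which is what makes it sensitive to cache capacity.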
[Figure: memory address bit fields, from MSB to LSB: tag, set, offset]
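The address fields explain why the stride matters. The sketch below maps a byte address to its cache set and counts how many distinct sets a strided walk touches. The cache geometry (64-byte lines, 64 sets) and the 4-byte element size are hypothetical values chosen for illustration, not figures from the slides, and the function names are assumptions as well.

```python
def cache_set(address, line_bytes=64, num_sets=64):
    """Set index of a byte address.

    The lowest bits select the offset inside a cache line, the next bits
    select the set, and the remaining high bits form the tag.
    """
    return (address // line_bytes) % num_sets


def sets_touched(stride_elems, count=64, elem_bytes=4):
    """Number of distinct cache sets hit by `count` strided accesses."""
    return len({cache_set(i * stride_elems * elem_bytes)
                for i in range(count)})
```

With these assumed parameters, a power-of-two stride such as 1024 elements maps every access to the same set, so `sets_touched(1024)` is 1, while a prime stride such as 1031 spreads the same accesses over many sets.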
Comparison: Strides
[Plot: time per voxel (20 ns to 220 ns) against number of voxels (1 k to 1 G); series: unchanged stride, prime stride]
Naive Approaches
Naive Vertical
foreach dimension do /* X, Y, Z axis */
    foreach sample do
        foreach lifting do
            step;
        end
    end
end
huge number of cache misses
three passes through the data
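One way to realize the vertical ordering is to run all four lifting steps in a single sweep along each line, with each later step lagging one sample behind the previous one. The sketch below follows that idea under assumptions: CDF 9/7 constants, a flat list with x fastest, symmetric extension, every dimension of size at least two, and no final scaling. The names `step` and `vertical_3d` are hypothetical.

```python
# Assumed CDF 9/7 steps as (coefficient, lag behind the sweep position).
STEPS = [(-1.586134342, 0), (-0.05298011854, 1),
         (0.8829110762, 2), (0.4435068522, 3)]


def step(vol, base, s, n, k, coeff):
    """One lifting step on sample k of a line (start `base`, stride s)."""
    if 0 <= k < n:
        left = k - 1 if k > 0 else k + 1       # mirror at the boundaries
        right = k + 1 if k + 1 < n else k - 1
        vol[base + k * s] += coeff * (vol[base + left * s]
                                      + vol[base + right * s])


def vertical_3d(vol, shape):
    """Naive vertical ordering: one sweep per axis, all steps fused."""
    nx, ny, nz = shape
    strides = (1, nx, nx * ny)
    for axis in range(3):                      # foreach dimension
        n, s = shape[axis], strides[axis]
        for base in range(len(vol)):
            if (base // s) % n != 0:
                continue                       # only line starts
            # The sweep position p visits odd samples; the extra
            # iterations past the end drain the lagging steps.
            for p in range(1, n + 4, 2):       # foreach sample (pair)
                for coeff, lag in STEPS:       # foreach lifting
                    step(vol, base, s, n, p - lag, coeff)
```

The lags guarantee that each step reads neighbours that have received exactly the preceding steps, so the result matches the horizontal ordering while the volume is traversed only three times.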
Comparison: Naive Approaches
[Plot: time per voxel (20 ns to 160 ns) against number of voxels (0 to 250 M); series: horizontal, vertical]
2-D Approach
2-D Slices
foreach slice do
    foreach sample do
        foreach lifting do step; /* X axis */
        foreach lifting do step; /* Y axis */
    end
end
/* Z axis */
foreach sample do
    foreach lifting do step;
end
4² core with SIMD
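The slice-wise loop structure can be sketched as below. This is a simplification: the fused per-sample X/Y core from the slides is replaced by two line passes per slice, which keeps the same loop nesting (both in-slice axes are finished before moving to the next slice) but is not the SIMD core itself. CDF 9/7 constants, flat x-fastest storage, symmetric extension, dimensions of at least two, and the names `lift_line` and `slices_3d` are all assumptions.

```python
# Assumed CDF 9/7 steps as (coefficient, parity of the updated samples).
STEPS = [(-1.586134342, 1), (-0.05298011854, 0),
         (0.8829110762, 1), (0.4435068522, 0)]


def lift_line(vol, base, s, n):
    """All four lifting steps on one line of n samples (stride s)."""
    for coeff, parity in STEPS:
        for k in range(parity, n, 2):
            left = k - 1 if k > 0 else k + 1   # mirror at the boundaries
            right = k + 1 if k + 1 < n else k - 1
            vol[base + k * s] += coeff * (vol[base + left * s]
                                          + vol[base + right * s])


def slices_3d(vol, shape):
    """Transform X and Y slice by slice, then Z in a separate pass."""
    nx, ny, nz = shape
    for z in range(nz):                        # foreach slice
        origin = z * nx * ny
        for y in range(ny):                    # X axis: rows of the slice
            lift_line(vol, origin + y * nx, 1, nx)
        for x in range(nx):                    # Y axis: columns of the slice
            lift_line(vol, origin + x, nx, ny)
    for start in range(nx * ny):               # Z axis: one line per (x, y)
        lift_line(vol, start, nx * ny, nz)
```

Since a single slice is far smaller than the volume, the X and Y work on it is more likely to stay in cache, leaving only the Z pass with a large stride.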
Comparison: Slices
[Plot: time per voxel (0 ns to 160 ns) against number of voxels (0 to 250 M); series: naive horizontal, naive vertical, slices]
3-D Approach
True 3-D
foreach sample do
    foreach lifting do step; /* X axis */
    foreach lifting do step; /* Y axis */
    foreach lifting do step; /* Z axis */
end
2³ cube
4³ with SIMD
3-D Single-Loop Approach
[Figure: single-loop processing of the volume along the x, y, and z axes, with one buffer per axis: buffer x, buffer y, buffer z]
Overall Comparison
[Plot: time per voxel (0 ns to 160 ns) against number of voxels (0 to 250 M); series: naive horizontal, naive vertical, core 4², core 2³, core 4³]
Conclusions
method         Intel Core2                 AMD Opteron
               time (ns/voxel)  speedup    time (ns/voxel)  speedup
naive horiz.   159.8            1.0        105.7            1.0
naive vert.    100.1            1.6        73.5             1.4
core 4²        53.8             2.9        41.0             2.5
core 2³        23.3             6.8        21.7             4.7
core 4³        13.5             11.7       12.9             8.0
core = streaming unit
CPU cache friendly = single-loop approach
SIMD friendly
