1. IEEE 2005 Workshop on Signal Processing Systems (SIPS'05), November 2, Athens, Greece
Flexible Hardware Architecture for 2-D
Separable Convolution-based Scaling
No single scaling technique suits all the kinds of images (photo, CAD, text...) that a user may want to print
or display. Formally, any convolution-based scaling operation can be decomposed into three steps: an anti-aliasing
filter, image reconstruction by continuous convolution, and re-sampling to the final grid. Based on this, we propose
a flexible, hardware-friendly discrete convolution engine that operates on a memory storing a programmable
2-D separable interpolation kernel. We also state a technique for optimizing the memory size given the kernel and the
scale factor. Finally, we describe a novel flexible filter that overcomes aliasing artifacts regardless of image
frequency content.
Jordi Arnabat and Francisco Cardells
Hewlett-Packard, Large-Format Technology Lab, Barcelona, Spain
{jordi.arnabat, francisco.cardells}@hp.com
Image Scaling
• No single interpolation technique achieves good image quality (IQ) for all types of images: adaptable hardware is the key to survival.
• Formally, scaling can be thought of as continuous reconstruction of the discrete input followed by re-sampling at the output grid.
• We propose a flexible hardware architecture built from a classical convolution-based scaler, where IQ is chosen by means of a programmable kernel.
[Figure: Filtering Stage. Block diagram in which low-pass filters (A, B) feed a digital interpolator that produces outputs Y1 and Y2.]
Scaler Data Flow
• Downscaling requires a pre-filtering step to remove frequencies that cannot be represented on the output grid (aliasing). Candidate pre-filters:
(a) Moving average
(b) Frequency-sharpened CIC
(c) Multistage CIC
• We propose an architecture that enables pre-filters (a) and (b).
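To illustrate how pre-filters (a) and (b) differ, the sketch below compares their zero-phase amplitude responses. It assumes the sharpening polynomial 3H^2 - 2H^3 applied to a length-R moving average H, which is a common way to sharpen a CIC filter; the poster does not state which polynomial is used, and the function names are hypothetical.

```python
import math

def moving_average_amplitude(w, R):
    """Zero-phase amplitude response of a length-R moving average.

    w is the frequency in rad/sample, restricted here to [0, pi].
    """
    if abs(math.sin(w / 2.0)) < 1e-12:
        return 1.0  # limit of the ratio at w = 0
    return math.sin(w * R / 2.0) / (R * math.sin(w / 2.0))

def sharpened_amplitude(w, R):
    """Amplitude response after sharpening with 3H^2 - 2H^3 (assumed polynomial)."""
    h = moving_average_amplitude(w, R)
    return 3.0 * h**2 - 2.0 * h**3

if __name__ == "__main__":
    R = 4
    for w in (0.0, math.pi / 8, math.pi / 4, 2 * math.pi / R):
        print(f"w={w:.3f}  moving average={moving_average_amplitude(w, R):+.4f}  "
              f"sharpened={sharpened_amplitude(w, R):+.4f}")
```

Under these assumptions the sharpened response stays closer to 1 in the passband while keeping the nulls of the moving average at multiples of 2*pi/R.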
• Wide range of interpolation techniques: nearest neighbor (NN), bilinear, bicubic, Gaussian, ..., yours!
• The complexity and latency of the hardware are determined by the support of the interpolation function.
• Resampling is performed by a shift-variant FIR filter whose length equals the kernel support.
• The kernel shape can be programmed in a memory as a LUT, sampled at a given frequency.
• As a design rule, any kernel shape needs twice as many samples per interval as the maximum scale factor.
• For example, a scaler performing up to 32x with a 4-tap support kernel and 8-bit word precision needs 2 x 32 = 64 samples per interval, i.e. 64 x 4 x 8 = 2048 bits, a 2 Kb LUT. The datapath for this interpolator requires 2.2 kgates.
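The design rule and the 32x example can be checked with a few lines of arithmetic. This is a minimal sketch; the function name and parameters are hypothetical and only restate the rule given in the bullets.

```python
def kernel_lut_bits(max_scale, taps, word_bits, oversample=2):
    """Bits needed to store a sampled interpolation kernel.

    oversample=2 encodes the design rule: twice as many samples per
    interval as the maximum scale factor.
    """
    samples_per_interval = oversample * max_scale
    return samples_per_interval * taps * word_bits

if __name__ == "__main__":
    bits = kernel_lut_bits(max_scale=32, taps=4, word_bits=8)
    print(bits, "bits =", bits // 1024, "Kb")
```

For a 32x scaler with a 4-tap kernel and 8-bit words this gives 2048 bits, matching the 2 Kb LUT quoted in the example.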
[Figure: bilinear and nearest-neighbor interpolation examples.]
Conventional scaling uses one hardwired set of rules for upscaling and another for downscaling. Instead, we build any scaling operation as flexible pre-filtering followed by interpolation. This flexibility is required because there is no single best scaling algorithm for all kinds of images.
Pre-filtering: programmable low-pass FIR filter, with the cut-off frequency given by the downscale factor.
Interpolation: programmable continuous convolution.
[Figure: data paths for up-scaling and down-scaling.]
[Figure: Programmable Interpolation Kernel. Kernel sampling for nearest neighbor and bicubic interpolation; the weights w(-1), w(0), w(1), w(2) are read from a table of entries W11..W14, W21..W24, ... addressed by neighbor index.]
[Figure: CIC pre-filter structures built from integrator stages 1/(1 - z^-1), rate changes by R, and comb stages (1 - z^-1). Panel labels (a) to (c).]
[Figure: Down-scaling by a factor of 1.5 after (a) moving average and (b) frequency-sharpened CIC filter, shown next to the original. Artifacts circled and images resized to aid direct comparison.]
Frequency response of three different pre-filtering
schemes. (a) & (b) are combined into one flexible
architecture.
[Figure: interpolation kernels. (a) Nearest neighbor (b) Bilinear interpolation (c) B-spline order 3 (d) Keys’ bicubic, a = -1/2]
[Figure: Interpolation by continuous convolution. Principles of operation. Output sample o[k] is computed from input samples 1..n using weights w1, w2.]
The shape of the interpolation kernel is sampled at a given frequency. The weights are stored as a LUT in a memory.
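A software sketch of this idea, assuming Keys' bicubic kernel with a = -1/2 and 1-D data: the kernel is sampled into a LUT with one row of weights per sub-sample phase, and each output sample is a short FIR sum whose weights depend on that phase. The function names, the clamped edge handling, and the use of unquantized floating-point weights are assumptions made for illustration, not the poster's hardware design.

```python
import math

def keys_kernel(x, a=-0.5):
    """Keys' cubic convolution kernel (support of 4 samples)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0

def build_lut(kernel, taps, phases):
    """Sample the kernel `phases` times per interval, one row of `taps` weights per phase."""
    first = -(taps // 2 - 1)  # leftmost neighbor offset, -1 for 4 taps
    lut = []
    for p in range(phases):
        frac = p / phases
        lut.append([kernel(frac - (first + j)) for j in range(taps)])
    return lut

def resample(signal, scale, lut):
    """Resample a 1-D signal with a shift-variant FIR filter driven by the LUT."""
    phases, taps = len(lut), len(lut[0])
    first = -(taps // 2 - 1)
    n = len(signal)
    out = []
    for k in range(int(n * scale)):
        x = k / scale                      # output position on the input grid
        base = math.floor(x)
        phase = int((x - base) * phases)   # LUT row for this sub-sample offset
        acc = 0.0
        for j, w in enumerate(lut[phase]):
            idx = min(max(base + first + j, 0), n - 1)  # clamp at the edges
            acc += w * signal[idx]
        out.append(acc)
    return out

if __name__ == "__main__":
    lut = build_lut(keys_kernel, taps=4, phases=64)  # 64 phases as in the 32x example
    print(resample([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 2, lut))
```

Swapping `keys_kernel` for another kernel function changes the interpolation without touching `resample`, which mirrors the role of the programmable memory.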
• In down-scaling, the low-pass filter does not have to be applied to all the incoming pixels.
• Instead, only the base points used by the interpolation are pre-computed to remove the aliasing frequencies.
• The number of equivalent serial low-pass filters must equal the kernel support.
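The bullets above can be sketched in software as follows. This is only an illustration under stated assumptions: a 2-tap linear interpolation kernel, a moving-average low-pass filter, 1-D data, and clamped edges. All function names are hypothetical. The sketch filters only the base points each output needs and checks that the result matches filtering every input pixel first.

```python
import math

def box_filter_at(signal, center, width):
    """Moving average of `width` samples around index `center` (edges clamped)."""
    n = len(signal)
    start = center - width // 2
    total = 0.0
    for i in range(start, start + width):
        total += signal[min(max(i, 0), n - 1)]
    return total / width

def downscale_lazy(signal, factor, width):
    """Filter only the base points needed by a 2-tap (linear) interpolation."""
    n = len(signal)
    out = []
    for k in range(int(n / factor)):
        x = k * factor
        base = math.floor(x)
        frac = x - base
        # Two taps, so two low-pass evaluations: one per base point.
        left = box_filter_at(signal, base, width)
        right = box_filter_at(signal, min(base + 1, n - 1), width)
        out.append((1.0 - frac) * left + frac * right)
    return out

def downscale_full(signal, factor, width):
    """Reference: low-pass filter every input pixel, then interpolate."""
    n = len(signal)
    filtered = [box_filter_at(signal, i, width) for i in range(n)]
    out = []
    for k in range(int(n / factor)):
        x = k * factor
        base = math.floor(x)
        frac = x - base
        out.append((1.0 - frac) * filtered[base]
                   + frac * filtered[min(base + 1, n - 1)])
    return out

if __name__ == "__main__":
    data = [float((i * 7) % 5) for i in range(32)]
    print(downscale_lazy(data, 4, 4))
    print(downscale_full(data, 4, 4))
```

In this sketch the lazy version performs (number of outputs) x (taps) filter evaluations, which is fewer than one per input pixel whenever the downscale factor exceeds the tap count.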