MOHIEDDIN MORADI
mohieddinmoradi@gmail.com
DREAM
IDEA
PLAN
IMPLEMENTATION
Mathematical Background
Measure of Information
• Random variable: X
• Self information: If the outcome of a random variable X is ai with
probability p(ai), then the self information is given by I(ai) = −log₂ p(ai).
• Entropy: The entropy H(X) of a random variable X with the given alphabet
AX and the probabilities pX is defined by H(X) = −Σ pX(a) · log₂ pX(a), summed over all a ∈ AX.
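As a minimal sketch, both measures in Python (the probabilities are illustrative):

import math

def self_information(p):
    # Self information of an outcome with probability p, in bits.
    return -math.log2(p)

def entropy(probs):
    # Entropy H(X): expected self information over the alphabet.
    return sum(p * self_information(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))  # 1.5 bits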
Distortion Measures
• Let x, y be signals, each consisting of n values with a possible range of [0,
xmax] (e.g. [0, 255] for 8 bit images). Then the usual measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR):
MSE(x, y) = (1/n) · Σ (xi − yi)²,  PSNR(x, y) = 10 · log₁₀ (xmax² / MSE(x, y)) dB.
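A small Python sketch of both distortion measures (8 bit images, xmax = 255, assumed):

import math

def mse(x, y):
    # Mean squared error between two equal-length signals.
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def psnr(x, y, xmax=255):
    # Peak signal-to-noise ratio in dB for signals in [0, xmax].
    return 10 * math.log10(xmax ** 2 / mse(x, y))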
Downsampling, Upsampling, and Delay
• Downsampling a sequence x by md can be expressed as yn = x(n·md).
• Upsampling a sequence x by mu can be expressed as yn = x(n/mu) if mu divides n, and yn = 0 otherwise.
• The downsampling and upsampling operations for a signal x are denoted by x ↓ md and x ↑ mu, respectively.
• Consider the sequence y defined by yn = x(n−mdly), that is, the signal x
delayed by mdly samples.
• In the z-transform domain this can be expressed as Y(z) = z^(−mdly) · X(z).
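A Python sketch of the three operations on finite sequences (zero padding stands in for the infinite signal):

def downsample(x, md):
    # y[n] = x[n*md]: keep every md-th sample.
    return x[::md]

def upsample(x, mu):
    # y[n*mu] = x[n], zeros in between.
    y = [0] * (len(x) * mu)
    y[::mu] = x
    return y

def delay(x, mdly):
    # y[n] = x[n - mdly]: shift right, zero-pad the front.
    return [0] * mdly + x[:len(x) - mdly]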
Wavelets
• Wavelets (little waves) are functions that are concentrated in time as well as
in frequency around a certain point
• we mostly mean a pair of functions:
 scaling function φ
 wavelet function Ψ
• The self similarity (refinement condition) of the scaling function φ is
tied to a filter h and is defined by
φ(t) = √2 · Σ hk · φ(2t − k)
• which means that φ remains unchanged if you filter it with h, downsample
it by a factor of two, and amplify the values by √2, successively
Refinement condition of the scaling function – In step a the scaling function is
duplicated, translated and scaled in abscissa. In step b the translated and scaled
duplicates are amplified.
Wavelets
• The wavelet function Ψ is built on φ with the help of a filter g:
Ψ(t) = √2 · Σ gk · φ(2t − k)
• φ and Ψ are uniquely determined by the filters h and g
Wavelets
• φ and Ψ are uniquely determined by the filters h and g
• Variants of these functions are defined, which are translated by an integer,
compressed by a power of two, and usually amplified by a power of √2, e.g.
φ(j,l)(t) = 2^(j/2) · φ(2^j · t − l) and accordingly for Ψ.
• j denotes the scale – the bigger j the higher the frequency and the thinner the
wavelet peak
• l denotes the translation – the bigger l the more shift to the right, and the
bigger j the smaller the steps
Wavelets
Discrete Wavelet Transforms
• The goal is to represent signals as linear combinations of
- wavelet functions at several scales
and
- scaling functions of the widest required scale (i.e. j=J)
• The coefficients c(1−J,l) and d(j,l) for −J < j ≤ 0 describe the transformed
signal we want to feed into a compression routine.
• J corresponds to the number of different scales we can represent, which is
equal to the number of transform levels.
• The bigger J, the more coarse structures can be described.
J=4
j=-3, c-3,l , d-3,l
j=-2, c-2,l , d-2,l
j=-1, c-1,l , d-1,l
j=0, c0,l , d0,l
A basis consisting of scaling and wavelet functions of the CDF(2,2) wavelet – This basis covers
three levels of wavelet functions. Only a finite clip of translates is displayed
Discrete Wavelet Transforms
 For even l, c(j,l) depends only on hk and gk with even k.
 For odd l, c(j,l) depends only on hk and gk with odd k.
• This is the reason why we will split both g and h into their even and odd
indexed coefficients for most of our investigations.
• It is easy to see that the conversion from wavelet coefficients to signal
values is possible without knowing φ or Ψ; the only information needed is
the filters which belong to them.
• Under certain conditions, the same is true for the reverse conversion.
• It allows us to limit our view to the filters g and h and hide the functions φ and Ψ.
• Thus the computation of this change in representation can be made with the
use of filters.
Discrete Wavelet Transforms
• wavelet analysis or wavelet decomposition : conversion from the original
signal to the wavelet coefficients
• wavelet synthesis or wavelet reconstruction : conversion from the wavelet
coefficients back to the signal or an approximated version of it.
• the analysis scaling and wavelet functions: φ̃ and Ψ̃
• the synthesis scaling and wavelet functions: φ and Ψ
• The corresponding filters are denoted accordingly by h̃ and g̃, and h and g.
Discrete Wavelet Transforms
One level of wavelet transform expressed using filters: the signal x is
convolved with a low pass and a high pass filter, and downsampling by two
discards the even and odd indices after filtering. The low pass branch
yields c as a coarse version of the signal x at half resolution; the high
pass branch yields d, the differences or details that are necessary to
reconstruct the original signal x from the coarse version.
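A sketch of one analysis level in Python, assuming the unnormalized CDF(2,2) (LeGall 5/3) analysis taps; boundary handling and phase alignment are simplified here:

import numpy as np

H = np.array([-1, 2, 6, 2, -1]) / 8.0  # low pass analysis filter
G = np.array([-1, 2, -1]) / 2.0        # high pass analysis filter

def analysis_level(x):
    # Filter, then discard every second coefficient.
    c = np.convolve(x, H, mode="same")[0::2]  # coarse version at half resolution
    d = np.convolve(x, G, mode="same")[1::2]  # details needed for reconstruction
    return c, d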
Discrete Wavelet Transforms
three levels of wavelet transform
Level 2 j=-2, c-2,l , d-2,l
Level 1 j=-1, c-1,l , d-1,l
Level 0 j=0, c0,l , d0,l
Discrete Wavelet Transforms
Cohen-Daubechies-Feauveau CDF(2,2) Wavelet
(biorthogonal (5,3) wavelet)
• Filter length of 5 and 3 for the low and high pass filters.
• The filters, as well as the scaling and wavelet functions for decomposition
and reconstructions, are symmetric.
• A symmetric filter f always has odd filter length and it holds that
f(a+k) = f(b−k)
where a and b are the smallest and greatest index l, respectively, for which fl is
different from zero.
Symmetry is a very important property if we consider image compression,
because in the absence of symmetry artifacts are introduced around edges.
Cohen-Daubechies-Feauveau CDF(2,2) Wavelet
(biorthogonal (5,3) wavelet)
Cohen-Daubechies-Feauveau CDF(2,2) Wavelet
(biorthogonal (5,3) wavelet)
The relation between the regularity of the synthesis wavelet and the
number of vanishing moments of the analysis wavelet
A biorthogonal wavelet has m vanishing moments if and only if its dual scaling
function generates polynomials up to degree m.
In other words, vanishing moments tend to reduce the number of significant wavelet
coefficients and thus, one should select a wavelet with many of them for the
analysis
On the other hand, regular or smooth synthesis wavelets give good approximations if
not all coefficients are used for reconstruction, as is the case for lossy
compression.
To increase the number of vanishing moments of the decomposition wavelet one has to
enlarge the filter length of the corresponding analysis low and high pass filters. That
is, there is a trade-off between filter length and the number of vanishing moments of
the decomposition wavelet.
In terms of image compression, you can improve the compression performance at the
expense of increased computational power to calculate the filter operations.
Lifting Scheme
• An alternative computation method of the discrete wavelet transform.
• In order to be consistent we base our introduction to Lifting on the
CDF(2,2) wavelet, which also serves as the running example.
The Lifting Scheme is composed of three steps, namely:
1- Split (also called Lazy Wavelet Transform)
(to split the input signal x into even and odd indexed samples)
2- Predict
(to predict the odd samples based on the evens)
3- Update
(to ensure that the average of the signal is preserved)
(figure: the input signal is split into even and odd samples. The odd samples
are replaced by the old ones minus the prediction, yielding the detail
coefficients; the even samples are updated to yield the approximation, the
coarser version of the input sequence at half the resolution, which ensures
that the average of the signal is preserved.)
Lifting Scheme
With z-transform notation we could express the split and merge step using
downsampling, delay, and upsampling, respectively.
Lifting Scheme
The predict and update steps for the CDF(2,2) wavelet
• Here the predictor is chosen to be linear.
• If the input signal itself is linear, all detail coefficients will be zero.
• Therefore we have
dl = x(2l+1) − (x(2l) + x(2l+2))/2 for the prediction step, and
cl = x(2l) + (d(l−1) + dl)/4 for the update step.
What are the advantages of this method ?
1. The most important fact is that we do not throw away already computed coefficients as in the
filter bank approach.
2. It is also remarkable that the wavelet transform can now be computed in place. This means
that, given a finite length signal with n samples, we need exactly n memory cells, each
capable of storing one sample, to compute the transform.
3. Furthermore we reduce the number of operations in order to compute the coefficients of the
next coarser or finer scale, respectively. For the CDF(2,2)-wavelet we save three operations
using the Lifting Scheme in comparison with the traditional filter bank approach.
Integer-to-Integer Mapping
• Obviously, the application of the filter bank approach or the Lifting Scheme leads to
coefficients which are not integers in general. In the field of hardware image compression it
would be convenient that the coefficients and the pixels of the reconstructed image are integers
too.
• For the special case of the CDF(2,2) wavelet we therefore use the prediction and update steps
as follows:
dl = x(2l+1) − ⌊(x(2l) + x(2l+2)) / 2⌋
cl = x(2l) + ⌊(d(l−1) + dl + 2) / 4⌋
• As a consequence the coefficients of all scales −J < j ≤ 0 can be stored as integers, and
integer arithmetic is sufficient for all operations. Note that the coarser the scale, the more
bits are necessary to store the corresponding coefficients. To overcome the growing bitwidth at
coarser scales, modular arithmetic can be used in the case of lossless compression.
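A minimal Python sketch of this integer-to-integer lifting step and its exact inverse (boundary handling by repeating the edge sample is an assumption for illustration):

def fwd53(x):
    # One level of integer CDF(2,2) lifting; // is the floor operation.
    even, odd = x[0::2], x[1::2]
    d = [odd[i] - (even[i] + even[min(i + 1, len(even) - 1)]) // 2
         for i in range(len(odd))]
    c = [even[i] + (d[max(i - 1, 0)] + d[min(i, len(d) - 1)] + 2) // 4
         for i in range(len(even))]
    return c, d

def inv53(c, d):
    # Exact inverse: integer arithmetic makes the mapping reversible.
    even = [c[i] - (d[max(i - 1, 0)] + d[min(i, len(d) - 1)] + 2) // 4
            for i in range(len(c))]
    odd = [d[i] + (even[i] + even[min(i + 1, len(even) - 1)]) // 2
           for i in range(len(d))]
    x = [0] * (len(even) + len(odd))
    x[0::2], x[1::2] = even, odd
    return x

x = [3, 7, 2, 8, 1, 9, 4, 6]
c, d = fwd53(x)
assert inv53(c, d) == x  # perfect reconstruction with integers only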
Lifting Scheme for the CDF(2,2) wavelet after Integer-to-Integer Mapping
Wavelet transforms on images
Wavelet transforms on images
• To transform images we can use two dimensional wavelets or apply the one
dimensional transform to the rows and columns of the image successively,
as a separable two dimensional transform.
• The image is interpreted as a two dimensional array I; its entries are the
image pixels, and after the transform the array holds the wavelet coefficients.
Wavelet transforms on images
Reflection at Image Boundary
A row of an image r = (r0, . . . , rN−1)
In order to convolve such a row r with a filter f we have to extend it to infinity
in both directions.
Extended row r′
Reflection at Image Boundary
There are several choices for the values of r′k outside the interval
[0, N−1].
The most popular ones are:
• padding with zeros,
• periodic extension, or
• symmetric extension.
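A sketch of the three extensions in Python (whole-sample symmetric reflection is shown; which variant is used is a choice of the codec):

def extend(r, k, mode="symmetric"):
    # Value of the extended row r' at index k (k may lie outside [0, N-1]).
    n = len(r)
    if 0 <= k < n:
        return r[k]
    if mode == "zero":
        return 0
    if mode == "periodic":
        return r[k % n]
    # symmetric: ..., r2, r1, r0, r1, ..., r_{N-1}, r_{N-2}, ...
    k = abs(k) % (2 * n - 2)
    return r[k] if k < n else r[2 * n - 2 - k]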
Reflection at Image Boundary
2D-DWT
After one level of transform we obtain N/2 coefficients c(0,l) and N/2 coefficients
d(0,k).
These are given in interleaved order, that is (c(0,0), d(0,0), c(0,1), d(0,1), . . .),
because of the split into odd and even indexed positions in the Lifting Scheme.
Usually this is rearranged so that all low pass coefficients come first,
followed by all high pass coefficients.
• Since we have restricted the images to be of square size N × N,
we can perform at most l = log₂ N levels of transform.
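A sketch of the separable scheme in Python, assuming a 1D transform dwt1d that returns its input rearranged with the low pass coefficients first:

import numpy as np

def dwt2d(img, levels, dwt1d):
    # Apply the 1D DWT to rows, then to columns, and recurse on the LL quadrant.
    out = img.astype(float).copy()
    n = out.shape[0]  # square image of size N x N
    for _ in range(levels):
        for i in range(n):
            out[i, :n] = dwt1d(out[i, :n])  # rows
        for j in range(n):
            out[:n, j] = dwt1d(out[:n, j])  # columns
        n //= 2  # continue on the low-low quadrant
    return out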
2D-DWT
2D-DWT
• In order to preserve the average of a one dimensional signal, or the average
of the brightness of images, we have to consider the normalization factors
after the wavelet transform has taken place
Normalization factors of the CDF(2,2) wavelet in two dimensions for each level l, 0 ≤ l < 5
2D-DWT
Lena horizontal low-pass
Lena low + high-pass subsampled
1-level 2-D wavelet decomposition
2-level 2-D wavelet decomposition
3-level 2-D wavelet decomposition
LEVEL 3
LEVEL 2
LEVEL 1
LEVEL 0
Diagrammatic representation of
the dyadic decomposition for three
decomposition levels
The in-place mapping scheme.
The dyadic decomposition is
applied to a hypothetical 8×8
original image.
State of the art Image Compression
Techniques
• In 1993, Shapiro presented an efficient method to compress wavelet
transformed images: the embedded zerotree wavelet (EZW) encoder.
• This EZW encoder exploits the properties of the multi-scale representation.
• A significant improvement of this central idea was introduced by Said and
Pearlman in 1996: Set Partitioning In Hierarchical Trees (SPIHT,
pronounced 'spite').
Shapiro’s Algorithm (EZW)
A Multi-resolution Analysis Example
Lower octave has higher
resolution and contains higher
frequency information
Shapiro’s Algorithm (EZW)
Parent-Child Relationship of the LL Subband
Shapiro’s Algorithm (EZW)
Tree Structure of Wavelet Coefficients
Parent:
– Coefficient at the coarse scale is
called parent
Children:
– All coefficients corresponding to
the same spatial location at the
next finer scale of similar
orientation
Descendants:
– For a given parent, the set of all
coefficients at all finer scale of
similar orientation, corresponding
to the same location
• Every coefficient at a given scale can be related to a set of coefficients at
the next finer scale of similar orientation.
Shapiro’s Algorithm (EZW)
Parent
Children
Hierarchical trees in multi-level decomposition:
coefficients at the same spatial location form a quad-tree.
Shapiro’s Algorithm (EZW)
• E – The EZW encoder is based on progressive encoding.
Progressive encoding is also known as embedded encoding
• Z – A data structure called zero-tree is used in EZW algorithm
to encode the data
• W – The EZW encoder is specially designed for use with the
wavelet transform. It was originally designed to operate on
images (2-D signals)
Shapiro’s Algorithm (EZW)
• A kind of bitplane coding.
• The kth bits of the coefficients constitute a bitplane.
• A bitplane encoder starts coding with the most significant bit of each
coefficient.
• Within a bitplane the bits of the coefficients with largest magnitude come
first.
The coefficients are shown in decreasing order from left to right.
Each coefficient is represented with eight bits, where the least significant bit is in front.
(figure: bitplane layout for N coefficients, sign bit and bits from MSB down to LSB;
the coding order proceeds bitplane by bitplane, starting at the MSB.)
Shapiro’s Algorithm (EZW)
• self similarities between different scales which result from the recursive
application of the wavelet transform step to the low frequency band
Shapiro’s Algorithm (EZW)
• Shapiro proposes to scan the samples from left to right and from top to
bottom within each sub band, starting in the upper left corner.
• The sub bands LH, HL, and HH at each scale are scanned in that order.
• Furthermore, in contrast to traditional bitplane coders he has introduced
data dependent examination of the coefficients.
• The idea behind this is: if there are large areas with unimportant samples in
terms of compression, they should be excluded from exploration. The
addressed self similarities are the key to performing such exclusions of large
areas.
Shapiro’s Algorithm (EZW)
• In order to exploit the self similarities
during the coding process, oriented
trees of outdegree four are taken for the
representation of a wavelet transformed
image.
• Each node of the trees represents a
coefficient of the transformed image.
• The levels of the trees consist of
coefficients at the same scale.
• The trees are rooted at the lowest
frequency subbands of the
representation.
Each coefficient in the LH, HL, and HH subbands of each scale has four children
The coefficients at the highest frequency subbands have no children
There is only one coefficient in the lowest frequency band (DC coefficient) that has three children
Oriented quad tree’s, four transform level, N = 16
Shapiro’s Algorithm (EZW)
A coefficient of the wavelet transformed image is insignificant with respect to a threshold th if its
magnitude |c| is smaller than 2^th.
Otherwise it is called significant with respect to the threshold th.
 Dominant pass
In the dominant pass, the coefficients are scanned in raster order (from left to right and from top
to bottom) within the quadrants. The scan starts with the quadrants of the highest transform
level.
• In each transform level the quadrants are scanned in the order HL, LH, and HH. The
coefficients are coded by symbol P, N, ZTR, or IZ.
A coefficient is coded by:
P, if its absolute value is greater than the given threshold and it is positive.
N, if its absolute value is greater than the given threshold and it is negative.
ZTR, if its absolute value is smaller than the given threshold and the absolute values of
all coefficients in the corresponding quad tree are smaller than the threshold, too (zero
tree root).
IZ, if its absolute value is smaller than the given threshold and there exists at least one
coefficient in the corresponding quad tree whose absolute value is greater than the given
threshold (isolated zero).
A compact sketch of this decision rule is given below.
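A compact Python sketch of the decision, assuming the maximum descendant magnitude of each coefficient has been precomputed:

def classify(c, tree_max, th):
    # EZW dominant-pass symbol for coefficient c; tree_max is the largest
    # magnitude in its quad tree of descendants.
    if abs(c) >= th:
        return "P" if c > 0 else "N"
    return "ZTR" if tree_max < th else "IZ"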
Shapiro’s Algorithm (EZW)
Z:
 It is used within the high frequency bands of level one only, because the
coefficients in these quadrants cannot be the root of a zerotree.
 It can thus be seen as the combination of ZTR and IZ for this special case.
Once a coefficient is encoded as the symbol P or N it is not included in the
determination of zerotrees.
 Subordinate pass:
• Each coefficient that has been coded as P or N in a previous dominant pass
is now refined by coding the th-bit of its binary representation.
• This corresponds to a bitplane coding, where the coefficients are refined in a data
dependent manner.
• The most important fact hereby is that no indices of the coefficients under
consideration have to be coded. This is done implicitly, due to the order in
which they become significant (coded as P or N in the dominant pass).
Shapiro’s Algorithm (EZW)
Example: encoding the wavelet
transformed image given in Figure using
the embedded zerotree wavelet algorithm
of Shapiro
the subband HL is excluded and the subband LH is scanned only partially.
Shapiro’s Algorithm (EZW)
The EZW algorithm is based on two observations:
– Natural images in general have a low pass
spectrum. When an image is wavelet
transformed, the energy in the sub-bands
decreases as the scale decreases (low scale
means high resolution), so the wavelet
coefficients will, on average, be smaller in the
lower levels than in the higher levels.
– Large wavelet coefficients are more important
than small wavelet coefficients.
631 544 86 10 -7 29 55 -54
730 655 -13 30 -12 44 41 32
19 23 37 17 -4 -13 -13 39
25 -49 32 -4 9 -23 -17 -35
32 -10 56 -22 -7 -25 40 -10
6 34 -44 4 13 -12 21 24
-12 -2 -8 -24 -42 9 -21 45
13 -3 -16 -15 31 -11 -10 -17
typical wavelet coefficients
for an 8×8 block in a real image
EZW – basic concepts
The observations give rise to the basic progressive coding idea:
1. We can set a threshold T; if the wavelet coefficient is larger than T, we
encode it as 1, otherwise as 0.
2. '1' will be reconstructed as T (or a number larger than T) and '0' will be
reconstructed as 0.
3. We then decrease T to a lower value and repeat steps 1 and 2, obtaining
finer and finer reconstructed data.
The actual implementation of the EZW algorithm has to consider:
1. How do we handle the sign of the coefficients (positive or negative)?
– answer: use POS (P) and NEG (N)
2. Can we code the '0's more efficiently? – answer: zero-tree
3. How do we decide the threshold T and how do we reconstruct? – answer: see the
algorithm
EZW – basic concepts
coefficients at the same spatial
location form a quad-tree.
• The definition of the zero-tree:
There are coefficients in different subbands that represent the same spatial
location in the image, and this spatial relation can be depicted by a quad tree,
except for the root node at the top left corner representing the DC coefficient,
which only has three children nodes.
• Zero-tree Hypothesis
If a wavelet coefficient c at a coarse scale is insignificant with respect to a
given threshold T, i.e. |c|<T then all wavelet coefficients of the same
orientation at finer scales are also likely to be insignificant with respect to T.
EZW – basic concepts
First step:
The DWT of the entire 2-D image will be computed by FWT
Second step:
Progressively EZW encodes the coefficients by decreasing the threshold
Third step:
Arithmetic coding is used to entropy code the symbols
EZW – basic concepts
Here MAX() means the maximum coefficient value in the image and y(x,y) denotes the
coefficient. With this threshold we enter the main coding loop
threshold = initial_threshold;
do {
dominant_pass(image);
subordinate_pass(image);
threshold = threshold/2;
} while (threshold > minimum_threshold);
The main loop ends when the threshold reaches a minimum value, which can be
specified to control the encoding performance; a minimum value of “0” gives
lossless reconstruction of the image.
The initial threshold t0 is decided as t0 = 2^⌊log₂(MAX(|y(x,y)|))⌋.
Second step:
Progressively, EZW encodes the coefficients by decreasing the threshold
(a skeleton of this loop is sketched below).
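A Python skeleton of the loop; initial_threshold, dominant_pass, and subordinate_pass are hypothetical helpers standing in for the passes described here:

def ezw_encode(image, minimum_threshold=1):
    threshold = initial_threshold(image)  # 2**floor(log2(max |coefficient|))
    while threshold > minimum_threshold:
        dominant_pass(image, threshold)
        subordinate_pass(image, threshold)
        threshold //= 2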
EZW – basic concepts
In the dominant_pass
• All the coefficients are scanned in a special order.
• If the coefficient is a zero tree root, it will be encoded as ZTR. None of its
descendants need to be encoded – they will be reconstructed as
zero at this threshold level.
• If the coefficient itself is insignificant but one of its descendants is
significant, it is encoded as IZ (isolated zero).
• If the coefficient is significant then it is encoded as POS (P) or NEG
(N) depending on its sign.
This encoding of the zero tree produces significant compression because gray level images
resulting from natural sources typically result in DWTs with many ZTR symbols. Each
ZTR indicates that no more bits are needed for encoding the descendants of the
corresponding coefficient
EZW – basic concepts
At the end of dominant_pass
• all the coefficients that are in absolute value larger than the current
threshold are extracted and placed without their sign on the subordinate
list and their positions in the image are filled with zeroes. This will
prevent them from being coded again.
In the subordinate_pass
• All the values in the subordinate list are refined. This gives rise to some
juggling with uncertainty intervals, and it outputs the next most significant
bit of all the coefficients in the subordinate list.
EZW – basic concepts
Wavelet coefficients for an 8×8 block
EZW – an example
The initial threshold is 32 and the result from the
dominant_pass is shown in the figure
Data without any symbol is a node in the zero-tree.
63(POS) -34(NEG)  49(POS)  10(ZTR)   7(IZ)  13(IZ) -12   7
-31(IZ)   23(ZTR)  14(ZTR) -13(ZTR)   3(IZ)   4(IZ)   6  -1
 15(ZTR)  14(IZ)    3      -12        5      -7       3   9
 -9(ZTR)  -7(ZTR) -14        8        4      -2       3   2
 -5        9       -1(IZ)   47(POS)   4       6      -2   2
  3        0       -3(IZ)    2(IZ)    3      -2       0   4
  2       -3        6       -4        3       6       3   6
  5       11        5        6        0       3      -4   4
Significance Map
EZW – an example
The result from the dominant_pass is output as the following:
D1: POS, NEG, IZ, ZTR, POS, ZTR, ZTR, ZTR, ZTR, IZ, ZTR, ZTR, IZ, IZ, IZ, IZ, IZ, POS, IZ, IZ
The significant coefficients are put in a subordinate list and are
refined. A one-bit symbol is output to the decoder.
Original data 63 34 49 47
Output symbol (S1) 1 0 1 0
Reconstructed data 56 40 56 40
For example, the output for 63 is:
sign 32 16 8 4 2 1
0     1  1 ? ? ? ?
After the dominant pass with T = 32, a significant coefficient is known to lie in
[32, 64). The subordinate pass compares it with the midpoint T + T/2 = 48: if the
data item is at least 48, a 1 is put in the code and it is reconstructed as the
centre of [48, 64), i.e. (48 + 64)/2 = 56; if it is below 48, a 0 is put in the
code and it is reconstructed as the centre of [32, 48), i.e. 40.
So 63 will be reconstructed as 56, and 34 will be reconstructed as 40.
EZW – an example
* * * 10 7 13 -12 7
-31 23 14 -13 3 4 6 -1
15 14 3 -12 5 -7 3 9
-9 -7 -14 8 4 -2 3 2
-5 9 -1 * 4 6 -2 2
3 0 -3 2 3 -2 0 4
2 -3 6 -4 3 6 3 6
5 11 5 6 0 3 -4 4
After the dominant pass, the significant coefficients are replaced by * and treated as zero.
Then the threshold is divided by 2, so we have 16 as the current threshold.
Significance Map
EZW – an example
The result from the second dominant pass is output as the following:
D2: IZ, ZTR, NEG, POS, IZ,IZ, IZ, IZ, IZ, IZ, IZ, IZ
The significant coefficients are put in the subordinate list
and all data in this list will be refined as:
Original data 63 34 49 47 31 23
Output symbol 1 0 0 1 1 0
Reconstructed data 60 36 52 44 28 20
For example, the output for 63 is:
sign 32 16 8 4 2 1
0     1  1 1 ? ? ?
The computation is now extended with respect to the next significant bit, so 63 will be
reconstructed as the average of 56 and 64, i.e. 60.
EZW – an example
The process is going on until threshold =1, the final output as:
p=pos, n=neg, z=iz, t=ztr, Di = output of the i'th dominant pass, Si = output of the i'th subordinate pass
D1: pnztpttttztttttttptt
S1: 1010
D2: ztnptttttttt
S2: 100110
D3: zzzzzppnppnttnnptpttnttttttttptttptttttttttptttttttttttt
S3: 10011101111011011000
D4: zzzzzzztztznzzzzpttptpptpnptntttttptpnpppptttttptptttpnp
S4: 11011111011001000001110110100010010101100
D5: zzzzztzzzzztpzzzttpttttnptppttptttnppnttttpnnpttpttppttt
S5: 10111100110100010111110101101100100000000110110110011000111
D6: zzzttztttztttttnnttt
For example, the output for 63 is:
sign 32 16 8 4 2 1
0     1  1 1 1 1 1
So 63 will be reconstructed as 32+16+8+4+2+1 = 63!
Note, how progressive transmission can be done.
EZW – an example
(figures: the progressively reconstructed 8×8 coefficient blocks after the successive passes.)
SPIHT- Set Partitioning In Hierarchical Trees
Said and Pearlman have significantly improved the codec of Shapiro.
The main idea is based on the partitioning of sets, which consist of coefficients or
representatives of whole subtrees.
They classify the coefficients of a wavelet transformed image in three sets:
 LIP: list of insignificant pixels which contains the coordinates of those
coefficients which are insignificant with respect to the current threshold th.
 LSP: list of significant pixels which contains the coordinates of those coefficients
which are significant with respect to th.
 LIS: list of insignificant sets which contains the coordinates of the roots of
insignificant subtrees.
During the compression procedure, the sets of coefficients in LIS are refined and if
coefficients become significant they are moved from LIP to LSP.
SPIHT- Set Partitioning In Hierarchical Trees
• The first difference from Shapiro's EZW algorithm is the distinct definition of
significance: here, the root of the tree is excluded from the computation
of the significance attribute.
Set O(i, j), D(i, j) and L(i, j):
offspring
SPIHT- Set Partitioning In Hierarchical Trees
Ex: the sets O(i, j), D(i, j) and L(i, j) with i = 1 and j = 1; the labels O, D, L
show that the coefficient is a member of the corresponding set (N = 8)
SPIHT- Set Partitioning In Hierarchical Trees
(figure: the set H of tree roots and the two types of LIS entries, type A and type B.)
SPIHT- Set Partitioning In Hierarchical Trees
In the SPIHT algorithm the significance is computed for the sets D(i, j) and L(i, j).
The root of each quadtree is, in contrast to the algorithm presented by Shapiro, not included in
the computation of the significance.
Significance Attribute
SPIHT- Set Partitioning In Hierarchical Trees
Parent-Child Relationship of the LL Subband
different parent-child relationships of the LL band
SPIHT- Set Partitioning In Hierarchical Trees
SPIHT- Set Partitioning In Hierarchical Trees
O(i,j): set of coordinates of all offspring of node (i,j); children only
D (i,j): set of coordinates of all descendants of node (i,j); children, grandchildren, great-grand, etc.
L (i,j): D(i,j) – O(i,j) (all descendants except the offspring); grandchildren, great-grand, etc.
SPIHT Sorting Pass
SPIHT- Set Partitioning In Hierarchical Trees
O(i,j): set of coordinates of all offspring of node (i,j); children only
D (i,j): set of coordinates of all descendants of node (i,j); children, grandchildren, great-grand, etc.
H (i,j): set of all tree roots (nodes in the highest pyramid level); parents
L (i,j): D(i,j) – O(i,j) (all descendants except the offspring); grandchildren, great-grand, etc. (a sketch of these sets follows below)
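A sketch of these sets as coordinate enumerations in Python (the special parent-child rule of the LL band roots is omitted for brevity):

def offspring(i, j, n):
    # O(i, j): the four children of (i, j) in an n-by-n transform.
    if 2 * i >= n or 2 * j >= n:
        return []  # highest frequency subbands have no children
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]

def descendants(i, j, n):
    # D(i, j): all descendants; L(i, j) = D(i, j) minus O(i, j).
    out, frontier = [], offspring(i, j, n)
    while frontier:
        out.extend(frontier)
        frontier = [c for node in frontier
                    for c in offspring(node[0], node[1], n)]
    return out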
SPIHT Refinement Pass
SPIHT- Set Partitioning In Hierarchical Trees
The refinement phase outputs the kth bit
of each element of the list LSP, if it was
not included in the last sorting phase.
Example of SPIHT
(figure: initialization — LIP receives all roots, LIS the D sets of the roots.)
SPIHT- Set Partitioning In Hierarchical Trees
(figures: example list states for n = 4, n = 3, and n = 2.)
SPIHT- Set Partitioning In Hierarchical Trees
63 -34  49  10   7  13 -12   7
-31  23  14 -13   3   4   6  -1
 15  14   3 -12   5  -7   3   9
 -9  -7 -14   8   4  -2   3   2
 -5   9  -1  47   4   6  -2   2
  3   0  -3   2   3  -2   0   4
  2  -3   6  -4   3   6   3   6
  5  11   5   6   0   3  -4   4
SPIHT- Set Partitioning In Hierarchical Trees
Example of 3-scale wavelet transform of an 8 by 8 image.
SPIHT- Set Partitioning In Hierarchical Trees
Example of 3-scale wavelet transform of an 8 by 8 image.
0) Initialization
LSP←{}
LIP ←{(63,-34,-31,23)}
LIS ←{(-34,-31,23)}
1) n=5 (T=2^5=32)
LSP←{(63,-34,49,47)}
LIP ←{(-31,23),(10,14,-13),
(15,14,-9,-7),(-1,-3,2)}
LIS ←{(23),-34B,(15,-9,-7)}
output:5,1+1-0011+000010000100101+0000
decoding
(figure: the decoded 8×8 image after this pass; the four significant coefficients appear as ±32, all other coefficients are 0.)
SPIHT- Set Partitioning In Hierarchical Trees
2) n=4 (T=2^4=16)
LSP←{(63,-34,49,47),(-31,23)}
LIP ←{(10,14,-13),(15,14,-9,-7),(-1,-3,2)}
LIS ←{(23),-34B,(15,-9,-7)}
output:
sorting pass:
4,1-1+000000000000000
refinement pass: 1010
decoding
(figure: the decoded 8×8 image after this pass; previously significant coefficients are refined and newly significant ones appear at ±16, all others are 0.)
SPIHT- Set Partitioning In Hierarchical Trees
3) n=3 (T=2^3=8)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9)}
LIP ←{(-7),(-1,-3,2),(3),(-5,3,0),(2,-3,5),(7,3,4),(7,6,-1),(3,3,2)}
LIS ←{(-7),23B,(14)}
output:
sorting pass:
3,1+1+1-1+1+1-0000101-1-1+01
101+0010001+0101+0011-0000
101+00
refinement pass: 100110
decoding
(figure: the decoded 8×8 image after this pass; the coefficients are reconstructed to multiples of 8.)
SPIHT- Set Partitioning In Hierarchical Trees
4) n=2 (T=2^2=4)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9),(-7,-5,5,7,4,7,6,6,-4,5,6,5,-7,4,4,6,4,6,6,-4,4)}
LIP ←{(-1,-3,2),(3),(3,0),(2,-3),(3),(-1),(3,3,2),(-2),(3,-2),(-2,2,0),
(3,0,3),(3)}
LIS ←{}
output:
sorting pass:
2,1-00001-00001+1+01+1+1+0000
11+1-1+1+111+1-1+011+1+00100
01+101+00101+1-1+
refinement pass:
10011101111011000110
decoding
(figure: the decoded 8×8 image after this pass; the coefficients are reconstructed to multiples of 4.)
SPIHT- Set Partitioning In Hierarchical Trees
5) n=1 (T=2^1=2)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9),(-7,-5,5,7,4,7,6,6,-4,5,6,5,-7,4,4,6,4,6,6,-4,4),
(-3,2,3,3,2,-3,3,3,3,2,-2,3,-2,-2,2,3,3,3)}
LIP ←{(-1),(0),(-1),(0),(0)}
LIS ←{}
output:
sorting pass:
1,01-1+1+1+01+1-1+01+1+1+1-1+
1-1-1+01+01+1+
refinement pass:
1101111101100100100010010
1110010100101100
decoding
(figure: the decoded 8×8 image after this pass; the coefficients are reconstructed to multiples of 2.)
SPIHT- Set Partitioning In Hierarchical Trees
6) n=0 (T=2^0=1)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9),(-7,-5,5,7,4,7,6,6,-4,5,6,5,-7,4,4,6,4,6,6,-4,4),
(-3,2,3,3,2,-3,3,3,3,2,-2,3,-2,-2,2,3,3,3),(-1,-1)}
LIP ←{(0),(0),(0)}
LIS ←{}
output:
sorting pass:
0,1-01-00
refinement pass:
1011110011010001110111110
1000101100000000101101111
001000111
decoding
(figure: the decoded 8×8 image now equals the original coefficients, i.e. lossless reconstruction.)
SPIHT- Set Partitioning In Hierarchical Trees
The basic Algorithm
LIP: List of Insignificant Pixel
LIS: List of Insignificant Set
LSP: List of Significant Pixel
SPIHT- Set Partitioning In Hierarchical Trees
(figure: initialization — LIP receives all roots, LIS the D sets of the roots.)
The basic Algorithm
SPIHT- Set Partitioning In Hierarchical Trees
Partitioned Approach
What are the advantages and disadvantages of this partitioned approach?
+ The internal memory requirements are dramatically reduced.
+ The subimages can be transformed independently of each other.
+ The traditional 2D-DWT can be directly applied to the subimages without any modifications.
− It is not applicable for lossy compression.
Boundary Treatment
Which pixels have to be considered in order to compute the low and high pass coefficients
using the CDF(2,2) wavelet?
A partitioned wavelet transform without boundary treatment introduces block
artifacts when targeting lossy compression.
(figures: the computation of the low pass coefficients and their dependences on the coefficients
of the previous scales; the area of coefficients which have to be incorporated into the
computation of the transform of the partition under consideration.)
Modifications to the SPIHT Codec
In this section we present the modifications necessary to obtain an efficient
hardware implementation of the SPIHT compressor based on the
partitioned approach to wavelet transform images.
 At first, we exchange the sorting phase with the refinement phase to save
memory for status information.
 The greatest challenge is the hardware implementation of the three lists
LIP, LSP, and LIS.
Exchange of Sorting and Refinement Phase
In the basic SPIHT algorithm, status information has to be stored for the
elements of LSP, specifying whether the corresponding coefficient has been
added to LSP in the current iteration of the sorting phase (see Line 20).
In the worst case all coefficients become significant in the same iteration.
Consequently, we would have to provide a memory capacity of q² bits to store
this information.
However, if we exchange the sorting and the refinement phase, we do not need
to consider this information anymore.
The compressed data stream is still decodable and it is not increased in size.
Of course, we have to consider that there is a reordering of the transmitted bits
during an iteration.
Memory Requirements of the Ordered Lists
To obtain an efficient realization of the lists LIP, LSP, and LIS, we first have to specify
the operations that take place on these lists and deduce worst case space
requirements in a software implementation.
Estimations for LIS
We have to provide the following operations for LIS:
• initialize as empty list (Line 0),
• append an element (Line 0 and 19),
• sequentially iterate the elements (Line 5),
• delete the element under consideration (Line 15 and 19),
• move the element under consideration to the end of the list and change the type (Line
14).
LIS contains at most (q/2)² elements.
To specify the coordinates of an element, 2 log₂(q/2) bits are needed.
A further bit to code the type information has to be added per element. This
results in an overall space requirement for LIS of (q/2)² · (2 log₂(q/2) + 1) bits.
Estimations for LIP and LSP
Now, let us consider both the list LIP and the list LSP because they can be
implemented together. Again,
we start with the operations applied to both lists:
• initialize as empty list (Line 0),
• append an element (Line 0, 4, 11, and 12),
• sequentially iterate the elements (Line 20),
• delete the element under consideration from LIP (Line 4).
The overall space requirement for both lists is q² · 2 log₂ q bits.
FPGA architectures
Prototyping Environment
The prototyping environment provides several
mechanisms to exchange data between the
mounted Xilinx device, the local SRAM, and the
PC main memory.
These could be categorized into:
1. direct access to registers/latches (configured
into the FPGA)
2. access to memory cells of the local SRAM
(all data transfers are going through the FPGA)
3. DMA transfers and DMA on demand
transfers
4. interrupts.
To communicate with the PCI card one has to write a simple C/C++ program which initializes the card, configures
the FPGA, sets the clock rate, and starts the data transfer or the computation.
• The Xilinx XC4085 device
consists of a matrix of
Configurable Logic Blocks
(CLB) with 56 rows and 56
columns.
• These CLBs are SRAM based
and can be configured many
times.
• After the power supply is shut
down, they have to be
reconfigured.
• It can be configured to represent
any function with 4 or 5 inputs
(function generator F, G, or H).
• There are mainly two flip-flops,
two tristate drivers, and the
so-called carry logic available.
The Xilinx XC4085 XLA device
The routing is done using programmable switch matrices (PSM)
(a) interconnections between the CLBs
(b) switch matrices in XC4085XLA
• Each CLB contains hard-wired carry logic to accelerate arithmetic operations.
• Fast adders can be constructed as simple ripple carry adders, using the special carry logic to
calculate and propagate the carry between full adders in adjacent CLBs. One CLB is capable
of realizing two full adders.
The Xilinx XC4085 XLA device
• Each CLB can be configured to
provide internal RAM.
• In order to do this, conceive the
four input signals to F and G as
address lines.
• Thus a CLB provides two 16×1 or
one 32×1 random access memory
module, respectively.
• At maximum we have 12kbyte
RAM available.
• Each RAM block can be configured
with different behavior:
•synchronous RAM: edge-triggered
• asynchronous RAM: level-sensitive
• single-port-RAM
• dual-port-RAM
The Xilinx XC4085 XLA device
2D-DWT FPGA Architectures targeting Lossless Compression
In the following, all four available architectures are listed.
• one Lifting unit
– data transfer from the PC directly to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
• four Lifting units working in parallel
– data transfer from the PC to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
The implementations with four Lifting units
introduce parallelism with respect to the rows of
a subimage.
Once the FPGA is configured, a whole image
can be transferred to the SRAM of the PCI card
(maximum size 2 Mbyte). Each subimage is then
loaded into the internal memory, wavelet
transformed, and the result is stored in the
SRAM, from where it can be transferred back to
the PC.
Another choice is to write a subimage directly
into the internal memory of the FPGA, start the
transform, and read the wavelet transformed
subimage back to the PC.
The computation itself is started if an internal
status register is set by the software interface.
The RAM itself is further decomposed into
smaller units to support the parallel execution of
the four Lifting units. It is capable of storing a
subimage of size 32×32.
There exist two separate finite state machines
to control the wavelet transform and its inverse,
respectively. (figure: the global data path diagram of a circuit with four parallel
working Lifting units.)
2D-DWT FPGA Architectures targeting Lossy Compression
We will distinguish between two different architectures for the partitioned
discrete wavelet transform on images.
I. 2D-DWT FPGA Architecture based on Divide and Conquer Technique
II. 2D-DWT FPGA Pipelined Architecture
2D-DWT FPGA Architecture based on Divide and Conquer Technique
In order to compute one level of the CDF(2,2) wavelet transform in one dimension we
need two pixels to the left and one pixel to the right of each row of an image.
Thus, we append to each row of an image one and two memory cells on the right and on
the left, respectively. We call such an enlarged row an extended row.
Let r be a row of an image of length 16. Now, the problem is split into two subproblems
which are solved interlocked. The first module computes the 'inner' coefficients c0,0, . . . ,
c0,15 and d0,0, . . . , d0,15, which only depend on the extended row. In order to proceed to the
next level of the wavelet transform, the coefficients at the appended positions have to be brought
up to date, i.e., the coefficients c0,−2, c0,−1, and c0,16 have to be computed. The first module,
which is responsible for the inner coefficients, computes the subtotals of c0,−2, c0,−1, and
c0,16 that only depend on the extended row. The second module computes the subtotals of
c0,−2, c0,−1, and c0,16 that depend on pixels not in the row.
2D-DWT FPGA Pipelined Architecture
• two one dimensional DWT units (1D-DWT) for horizontal
and vertical transforms
• a control unit as a finite state machine
• an internal memory block
To process a subimage, all rows are transferred to the FPGA over
the PCI bus and transformed on the fly in the horizontal
1D-DWT unit using pipelining. The coefficients computed
in this way are stored in internal memory of different types.
The coefficients corresponding to the rows of the subimage
itself are stored in single port RAM.
Now the vertical transform levels can take place. This is done by
the vertical 1D-DWT unit.
The control unit coordinates these steps in order to process a
whole subimage and is responsible for generating enable
signals, address lines, and so on.
At the end, the wavelet transformed subimage is available in the
internal RAM. At this point an EZW algorithm can be
applied to the multiscale representation of the subimage.
Since all necessary boundary information was included in the
computation, no block artifacts are introduced by the
following quantization.
control unit
4-level Horizontal 1D-DWT unit
The whole horizontal transform is done for the 16 rows of the subimage under consideration.
In addition, 30 rows of the neighboring subimage in the north and 15 rows of the southern
subimage are transformed in the same manner. These additional computations are required by the
vertical DWT applied next.
This unit has to take four pixels of a row at each clock cycle and must perform 4 levels of
horizontal transforms.
The unit consists of four pipelined stages, one for each transform level.
(figure: the four pipelined stages. The first stage runs at clock F = f0, the second stage at
F = f0/2; each stage splits its input into even and odd indexed samples, alternately outputs
a low or a high frequency coefficient at one clock cycle, and feeds its low frequency
coefficients as input to the next stage. The third and fourth stages work analogously.)
Recall:
Lifting Scheme for the CDF(2,2) wavelet after Integer-to-Integer Mapping
The w-bit input 1D-DWT unit (1i2o)
To implement one level of DWT using the lifting method the following steps are
necessary:
• split the input into coefficients at odd and even positions
• perform a predict step, that is, the operation given above
• perform an update step, that is, the operation given above
predict unit
update unit
w-bit input 1D-DWT unit (accordingly to the lifting scheme)
The unit consists of two register chains.
The registers in the upper chain are enabled at even, the registers in the lower chain at
odd clock edges. This splits the input into words at even and odd positions.
Now the predict and update steps can be applied straightforward.
The 4w-bit input 1D-DWT unit (4i4o)
This unit takes four pixels of the same row at a time.
We use it for both the first and the second level of the transform.
• The internal memory for the wavelet coefficients is capable of storing 16×16
coefficients.
• Since the bitwidth of the coefficients differs with their corresponding subbands, the
memory block consists of 5 slices.
• In Figure (a) we have shown once again the minimal required bitwidth for each
subband.
• The structure of the internal memory is illustrated in Figure (b).
Internal Coefficient Memory
FPGA-Implementation of the Modified SPIHT Encoder
(figure: block diagram of the modified SPIHT compressor — a coefficient memory to store
the wavelet coefficients, LP := LIP ∪ LSP, and the modules SL and SD storing the
pre-computed significance attributes for all thresholds; a table gives the size in bits
of each RAM block.)
In comparison to the basic SPIHT algorithm, the modified SPIHT image
compression reduces the required internal memory considerably
(N = 512, d0 = 11).
• Each subimage of the wavelet transformed image is
transferred once to the internal memory module named
’coeff’ or is already stored there.
• At first, the initialization of the modules representing
LIP, LSP, and LIS and the computation of the
significances is done in parallel.
• The lists LIP and LSP are managed by the module
’LP’, the bitmap of LIS by the module ’LIS’.
• The significances of sets are computed for all
thresholds th ≤ kmax at once and are stored in the
modules named ’SL’ and ’SD’, respectively.
• Here we distinguish between the significances for the
sets L and D.
• With this information the compression can be started
with bit plane kmax.
• Finite state machines control the overall procedure.
• The data to be output is registered in the module 'output',
from which it is put to the local SRAM on the PCI card
over a 32 bit wide data bus.
• Additionally, an arithmetic coder can be configured
into that module; this further reduces the size of the
compressed data stream.
the overall functionality
Hardware Implementation of the Lists
• To reduce the memory requirement for the list data structures in the worst case, we
implement the lists as bitmaps.
• The bitmap of list L represents the characteristic function of list L.
• The RAM module which realizes LIP and LSP has a configuration of q × q
entries of bit length 2, as for each pixel of the q × q subimage either
(i, j) ∈ LIP, (i, j) ∈ LSP, or (i, j) ∉ LIP ∪ LSP holds.
• The second RAM module implements LIS.
(table: the possible configuration states of a coordinate (i, j), encoded in two bits.)
• Since none of the coefficients in the level-one high frequency subbands can be the
root of a zerotree, a bitmap of size (q/2)² bits suffices for LIS.
• Some coefficients can be of type B, while others are always of type A: only for the
area which corresponds to LL(1) does the type information have to be stored. This
results in additional (q/4)² bits.
Efficient Computation of Significances
 Computing the significance of an individual coefficient: this is trivial.
Just select the kth bit of |c(i,j)| in order to obtain Sk(i, j). This can be realized by using bit
masks and a multiplexer.
 Computing significance of sets for all thresholds in parallel:
We define S*(T) as
S*(T) = max over (i, j) ∈ T of ⌊log₂ |c(i,j)|⌋
 Thus, S*(T) stands for the maximum threshold k for which some coefficient in T
becomes significant.
 Once S*(T) is computed for all sets L and D, we have preprocessed the
significances of sets for all thresholds. In order to do this, we use the two RAM
modules SL and SD. They are organized as the following memory layout, respectively.
SL SD
• The computation is done bottom up in the hierarchy defined by the spatial oriented
trees.
• The entries of both RAMs are initialized with zero.
• Now, let (e, f) be a coordinate with 0 < e, f < q just handled by the bottom-up
process, and let (i, j) = (⌊e/2⌋, ⌊f/2⌋) be the parent of (e, f), if it exists. Then SD and SL
have to be updated by the following process (a software sketch is given below).
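A software sketch of this bottom-up precomputation (the hardware uses the finite state machine described next; the LL band special cases and the type A/B distinction are simplified away):

import math

def star(v):
    # S* of a single coefficient: floor(log2 |v|), or -1 for v == 0.
    return int(math.floor(math.log2(abs(v)))) if v else -1

def precompute_significance(coeff):
    # coeff is a q-by-q array; SD/SL hold, per node, the maximum threshold k
    # at which some coefficient in D(i, j) / L(i, j) becomes significant.
    q = len(coeff)
    SD = [[-1] * q for _ in range(q)]
    SL = [[-1] * q for _ in range(q)]
    for e in range(q - 1, 0, -1):      # children are visited before parents
        for f in range(q - 1, 0, -1):
            i, j = e // 2, f // 2      # parent of (e, f)
            SL[i][j] = max(SL[i][j], SD[e][f])                     # grandchildren and below
            SD[i][j] = max(SD[i][j], SD[e][f], star(coeff[e][f]))  # all descendants
    return SL, SD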
• After reset we start the computation in state
one and initialize k’max and the row and
column indices e and f.
• At this time SD(e, f) and SL(e, f) hold their
old values from the last subimage under
consideration for all 0 < e, f < q.
• If the enable signal becomes active, we
proceed in state 2. Here we buffer the
present value of SD(e, f).
• In the states 3, 4, 5, and 6 we compute line
(6.1).
• The condition e, f are odd checks, if we
visit a 2 × 2 coefficient block for the first
time.
• The states 2, 5, and 6 are responsible for
computing the maximum of SD(i, j) and
S*(e, f) (line (6.2)), which is buffered in tS.
• State 8 performs the assignment in line
(6.3). Furthermore, this finite state machine
updates the value k′max = kmax + 1
for the subimage under consideration.
• In state 10 the low frequency coefficient at
position (0, 0) will be included in this
computation, too.
• The operation in state 9 is done using a
simple subtractor and a combined
representation with interleaved bitorder of
the row and column index e and f, that is
fn−1, en−1, fn−2, en−2, . . . , f1, e1, f0, e0.
My implementation
Lifting Scheme
s=a+ (b/2)
d=b-a
Hardware platform
The hardware platform used [WILDFORCE] is a PCI plug-in board with five
Xilinx 4085 FPGAs, also referred to as PEs (Processing Elements).
The board is stacked with five 1 MB SRAM chips.
Each of the five SRAM chips is directly connected to one of the five PEs.
The embedded memory is accessible for read/write from both the host
computer and the corresponding PE.
Each 1 MB memory chip is organized as 262144 words of 32 bits each.
Memory read/write
• The input image : 512 by 512 pixels
• Input frames are loaded to the embedded memory by the host computer
and results are read back, once the PE has processed it.
• The PE also uses the embedded memory as intermediate storage to hold
results between different stages of processing.
• Memory reads can be pipelined so that the effects of this latency are
minimized.
Design partitioning
The whole computation is partitioned into two stages.
The first stage:
Computes discrete wavelet transform coefficients of the input image frame and
writes it back to the embedded memory.
The second stage:
Operates on this result to complete the rest of the processing (dynamic
quantization, zero thresholding, run length encoding for zeroes, and
entropy encoding on the coefficients)
The two stages are implemented on two separate FPGAs.
Stage 1: Discrete Wavelet Transform
(2, 2) wavelet:
• A modified form of the biorthogonal (2,2) Cohen-Daubechies-Feauveau
wavelet filter is used. The analysis filter equations are shown below.
• The boundary conditions are handled by symmetric extension of the
coefficients as shown below
• The synthesis filter equations are shown below
DWT in X and Y directions
Coefficient ordering along X direction
Coefficient ordering along Y direction
Each pixel in the input frame is represented by 16 bits, accounting for 2 pixels per
memory word. Thus, each memory read brings in two consecutive pixels of a row.
3 stages of wave-letting
High pass and Low pass coefficients at stage 1, X direction
Interleaved ordering along the 3 stages of wave-letting
• Memory addressing is done with a pair of address registers - read and write
address registers.
Stage 1 architecture
The difference between write and read registers is the latency of the pipelined
data-flow blocks.
The maximum and minimum coefficient values for each block (each
quadrant in the multi stage wave-letting) are maintained on the FPGA.
These values are written back to a known location in the lower half (lower
0.5MB) of the embedded memory.
The second stage, uses these values for the dynamic quantization
of the coefficients.
Stage 2
Dynamic quantization
• The coefficients from different sub-bands are quantized separately.
The dynamic range of the coefficients for each sub-band (computed in
first stage) is divided into 16 quantization levels.
• The coefficients are quantized into one of the 16 possible levels.
• The maximum and minimum values of the coefficients for each
sub-band are also needed while decoding the image.
The quantizer is implemented as a binary search tree look-up in hardware (sketched below).
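A Python sketch of the 16-level quantization as a binary search (uniform bin edges over the sub-band's dynamic range are an assumption):

def quantize(c, cmin, cmax, levels=16):
    # Binary search over the bin edges: 4 comparisons for 16 levels,
    # mirroring a binary search tree look-up in hardware.
    lo, hi = 0, levels
    while hi - lo > 1:
        mid = (lo + hi) // 2
        edge = cmin + (cmax - cmin) * mid / levels
        if c < edge:
            hi = mid
        else:
            lo = mid
    return lo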
Zero thresholding and RLE on zeroes
Stage 2
Different thresholds are used for different sub-bands, resulting in different resolution
in different sub-bands.
Entropy encoding
Stage 2
Entropy encoding
• The encoding is implemented
by two look-up tables on the
FPGA. Given an eight bit
input, the first look-up table
(LUT), provides information
about the size of encoding.
The second LUT gives the
actual encoding.
• Only the relevant bits from
the second LUT should be
used.
• The rest of the bits in the
output are don’t care and are
either chosen as logic 0 or 1
during logic optimization.
Stage 2
Entropy encoder
Entropy encoding
-Bit packing:
The output of the entropy encoder varies from 3 to 18 bits. The bits need to be
packed into 32 bit words before being written back to the embedded
memory.
This is achieved by the shifter, which is inspired by the Xtetris
computer game and the binary search algorithm.
The shifter consists of 5 register stages, each 32 bits wide. The input data can
be shifted (rotated) by 16 or latched without shifting, to stage 1.
The data can be shifted by 8 or passed on straight from stage 1 to stage 2.
Similarly data can be shifted by 4, 2, and 1 when moving between the
remaining stages.
Data is shifted from stage to stage, and is accumulated at the last stage.
When the last stage has 32 bits of data, a memory write is initiated and the last
stage is flushed.
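A software analogue of the shifter (the hardware uses the five register stages described above; this sketch accumulates variable-length codes MSB-first into 32 bit words):

def pack_bits(codes, width=32):
    # codes is a list of (value, nbits) pairs, nbits between 3 and 18.
    words, acc, nbits = [], 0, 0
    for value, n in codes:
        acc = (acc << n) | (value & ((1 << n) - 1))
        nbits += n
        while nbits >= width:           # a full word is available: flush it
            nbits -= width
            words.append((acc >> nbits) & ((1 << width) - 1))
    if nbits:                            # final partial word, left aligned
        words.append((acc << (width - nbits)) & ((1 << width) - 1))
    return words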
Stage 2
Stage 2
Output file format
• At the end of the second stage, the upper memory (upper
0.5MB) contains the packed bit stream. The total count of the
bit stream approximated to the nearest WORD is written to
memory location 0. To reconstruct the data from the bit stream,
the following information is needed.
 The actual bit stream. On Huffman decoding, the actual 8 bit
codes are retrieved. These codes are either the quantizer output,
or the RLE count. On expanding the RLE count to the
corresponding number of zeroes, we get the actual quantized
stream.
 The four quadrants of the final stage of wave-letting can be
located at the first four 128×128 byte blocks. The three
quadrants of the next stage can be located at the next three
blocks, sized 256×256 bytes each. Each quadrant (sub-band) is
quantized separately. The dynamic range of each quadrant
must be known to reconstruct the original stream.
 The output file written has all the information needed to
reconstruct the image
Stage 2
Output file format
Stage 2
Stage 2, data flow diagram
Overall architecture
Wavelet coefficients from memory are read from the lower half of the embedded
memory. The block (sub-band) minimum and maximum is also read from the
memory. The packed bit stream output is written to the upper memory, and the bit
stream length is written to memory location 0. The control software, reads the
embedded memory and generates the compressed image file.
Before reading the wavelet coefficients, the maximum and minimum of coefficients in
each sub-band are read from the lower memory. The coefficients are then read and
processed for each sub-band, starting with the lowest frequency band. As shown in
the state diagram, a memory read is fired in state Read 001. A memory read has a
latency of 2 clock cycles; the result of the read is finally available in state Read
100.
Memory writes are completed in the same cycle. The two intermediate states, Read 010
and Write can be used to write back the output, if output is available.
Each memory read brings in two wavelet coefficients.
Consider the worst case, where the two coefficients get expanded to 18 bits each.
There are two memory write cycles before the next read. Whenever a memory
write is performed, the memory address register is incremented. The read address
generators read each sub-band from the interleaved memory pattern.
The output is written as a continuous stream, starting with the lowest sub-band. Thus
the output is effectively in Mallat ordering and can be progressively
transmitted/decoded.
Stage 2, control flow diagram
Questions??
Discussion!!
Suggestions!!
Criticism!!
Wavelet Based Image Compression Using FPGA
  • 13. Discrete Wavelet Transforms
• Wavelet analysis or wavelet decomposition: the conversion from the original signal to the wavelet coefficients.
• Wavelet synthesis or wavelet reconstruction: the conversion from the wavelet coefficients back to the signal, or to an approximated version of it.
• We distinguish the analysis scaling and wavelet functions from the synthesis scaling and wavelet functions φ and Ψ; the corresponding filters are denoted accordingly, the analysis filter pair alongside the synthesis filters h and g.
  • 14. Discrete Wavelet Transforms
One level of wavelet transform expressed using filters: the signal x is passed through the low pass and high pass filters, and even and odd indices are discarded after filtering. The low pass branch yields a coarse version of the signal x at half resolution; the high pass branch yields the differences or details that are necessary to reconstruct the original signal x from the coarse version.
  • 16. Discrete Wavelet Transforms
Three levels of wavelet transform: Level 0 (j=0, c0,l, d0,l), Level 1 (j=-1, c-1,l, d-1,l), Level 2 (j=-2, c-2,l, d-2,l).
  • 17. Cohen-Daubechies-Feauveau CDF(2,2) Wavelet (biorthogonal (5,3) wavelet)
• Filter lengths of 5 and 3 for the low and high pass filters.
• The filters, as well as the scaling and wavelet functions for decomposition and reconstruction, are symmetric.
• A symmetric filter f always has odd filter length, and it holds that fa+k = fb−k, where a and b are the smallest and greatest index l, respectively, for which fl is different from zero.
• Symmetry is a very important property for image compression, because in the absence of symmetry artifacts are introduced around edges.
  • 20. The relation between the regularity of the synthesis wavelet and the number of vanishing moments of the analysis wavelet
• A biorthogonal wavelet has m vanishing moments if and only if its dual scaling function generates polynomials up to degree m. Vanishing moments tend to reduce the number of significant wavelet coefficients, so one should select a wavelet with many of them for the analysis.
• On the other hand, regular or smooth synthesis wavelets give good approximations if not all coefficients are used for reconstruction, as is the case in lossy compression.
• To increase the number of vanishing moments of the decomposition wavelet, one has to enlarge the filter length of the corresponding analysis low and high pass filters. That is, there is a trade-off between filter length and the number of vanishing moments of the decomposition wavelet. In terms of image compression, one can improve the compression performance at the expense of the computational power needed to calculate the filter operations.
  • 21. Lifting Scheme
• An alternative method for computing the discrete wavelet transform.
• To be consistent, we base our introduction to lifting on the CDF(2,2) wavelet, which also serves as the running example. The Lifting Scheme is composed of three steps:
1. Split (also called the Lazy Wavelet Transform): split the input signal x into even and odd indexed samples.
2. Predict: predict the odd samples based on the even ones.
3. Update: ensure that the average of the signal is preserved.
  • 22. Lifting Scheme
The odd samples are replaced by the old ones minus the prediction; these are the detail coefficients. The updated even samples form the approximation, the coarser version of the input sequence at half the resolution; the update step ensures that the average of the signal is preserved.
  • 23. Lifting Scheme
With z-transform notation, the split and merge steps can be expressed using downsampling, delay, and upsampling, respectively.
  • 24. The predict and update steps for the CDF(2,2) wavelet
• Here the predictor is chosen to be linear; in that case the detail coefficients of a linear signal will be zero.
• Therefore we have the corresponding formulas for the prediction step and for the update step (see the sketch below).
What are the advantages of this method?
1. The most important fact is that we do not throw away already computed coefficients, as in the filter bank approach.
2. It is also remarkable that the wavelet transform can now be computed in place. This means that, given a finite length signal with n samples, we need exactly n memory cells, each capable of storing one sample, to compute the transform.
3. Furthermore, we reduce the number of operations needed to compute the coefficients of the next coarser or finer scale. For the CDF(2,2) wavelet we save three operations using the Lifting Scheme in comparison with the traditional filter bank approach.
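A minimal Python sketch of one lifting level for the CDF(2,2) wavelet: split, linear prediction of the odd samples from their even neighbours, and an update of the even samples that preserves the signal average. The simple boundary handling (repeating the last neighbour) is an assumption made for brevity:

def cdf22_lift(x):
    even, odd = x[0::2], x[1::2]                    # split (lazy wavelet)
    n = len(odd)
    # predict: d[l] = odd[l] - (even[l] + even[l+1]) / 2
    d = [odd[l] - (even[l] + even[min(l + 1, len(even) - 1)]) / 2
         for l in range(n)]
    # update: c[l] = even[l] + (d[l-1] + d[l]) / 4
    c = [even[l] + (d[max(l - 1, 0)] + d[min(l, n - 1)]) / 4
         for l in range(len(even))]
    return c, d                                     # coarse part, details

c, d = cdf22_lift([0, 1, 2, 3, 4, 5, 6, 7])
print(d)   # interior details are zero for a linear signal: [0.0, 0.0, 0.0, 1.0]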
  • 25. Integer-to-Integer Mapping
• Obviously, the application of the filter bank approach or the Lifting Scheme leads to coefficients which are not integers in general. For hardware image compression it is convenient that the coefficients and the pixels of the reconstructed image are integers as well.
• For the special case of the CDF(2,2) wavelet, we therefore use rounded prediction and update steps.
• As a consequence, the coefficients of all scales −J < j ≤ 0 can be stored as integers, and integer arithmetic is sufficient for all operations. Note that the coarser the scale, the more bits are necessary to store the corresponding coefficients. To overcome the growing bit width at coarser scales, modular arithmetic can be used in the case of lossless compression.
  • 26. Lifting Scheme for the CDF(2,2) wavelet after integer-to-integer mapping (see the sketch below)
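A sketch of the reversible integer variant, together with its exact inverse. The rounding convention shown is the one commonly used for the reversible 5/3 transform; treat it as an assumption rather than the slide's exact formula:

def cdf22_lift_int(x):
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # rounded prediction: floor of the neighbour average
    d = [odd[l] - (even[l] + even[min(l + 1, len(even) - 1)]) // 2
         for l in range(n)]
    # rounded update: floor((d[l-1] + d[l] + 2) / 4)
    c = [even[l] + (d[max(l - 1, 0)] + d[min(l, n - 1)] + 2) // 4
         for l in range(len(even))]
    return c, d                                 # both lists hold integers only

def cdf22_unlift_int(c, d):
    n = len(d)
    # invert the update, then the prediction, then merge even/odd samples
    even = [c[l] - (d[max(l - 1, 0)] + d[min(l, n - 1)] + 2) // 4
            for l in range(len(c))]
    odd = [d[l] + (even[l] + even[min(l + 1, len(even) - 1)]) // 2
           for l in range(n)]
    return [v for pair in zip(even, odd) for v in pair]

x = [71, 68, 70, 75, 80, 82, 79, 77]
assert cdf22_unlift_int(*cdf22_lift_int(x)) == x    # perfect reconstruction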
  • 28. Wavelet transforms on images
• To transform images, we can either use two dimensional wavelets or apply the one dimensional transform to the rows and columns of the image successively, as a separable two dimensional transform. The image is interpreted as a two dimensional array I.
  • 29. Wavelet transforms on images
Figure: the image pixels, the wavelet transformed image, and the coefficients.
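A minimal sketch of the separable two dimensional transform just described: the same 1D single-level transform is applied first to every row and then to every column. haar_lift is the simplest possible lifting step and only stands in for the CDF(2,2) step; the layout puts the LL quadrant in the top left corner:

def haar_lift(x):
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]          # predict
    c = [e + dd / 2 for e, dd in zip(even, d)]      # update: keeps the mean
    return c, d

def dwt2d(image, lift1d):
    def rows(img):
        out = []
        for r in img:
            c, d = lift1d(r)
            out.append(c + d)          # low-pass half, then high-pass half
        return out
    t = rows(image)                    # transform all rows
    t = [list(col) for col in zip(*t)] # transpose
    t = rows(t)                        # transform all columns
    return [list(col) for col in zip(*t)]   # transpose back

flat = [[5] * 8 for _ in range(8)]
out = dwt2d(flat, haar_lift)
print(out[0][0], out[4][4])   # 5.0 in LL, 0 in the detail quadrants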
  • 30. Reflection at Image Boundary
A row of an image r = (r0, . . . , rN−1). In order to convolve such a row r with a filter f, we have to extend it to infinity in both directions, giving an extended row r′.
  • 31. Reflection at Image Boundary
There are several choices for the values of r′k outside the interval [0, N−1]. The most popular ones are (see the sketch below):
• padding with zeros,
• periodic extension, or
• symmetric extension.
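The three extensions, illustrated with numpy's padding modes (the mode names are numpy's, not the slides'); 'reflect' is the symmetric extension that avoids artificial jumps at the border:

import numpy as np

r = np.array([10, 20, 30, 40])
print(np.pad(r, 2, mode='constant'))   # zeros:     0  0 10 20 30 40  0  0
print(np.pad(r, 2, mode='wrap'))       # periodic: 30 40 10 20 30 40 10 20
print(np.pad(r, 2, mode='reflect'))    # symmetric: 30 20 10 20 30 40 30 20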
  • 34. 2D-DWT
After one level of transform we obtain N/2 coefficients c0,l and N/2 coefficients d0,k. Because of the split into odd and even indexed positions in the Lifting Scheme, these are given in interleaved order, which is usually rearranged so that all low pass coefficients come first (see the sketch below).
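A small sketch of this reordering and its inverse:

y = ['c0', 'd0', 'c1', 'd1', 'c2', 'd2', 'c3', 'd3']   # interleaved order
rearranged = y[0::2] + y[1::2]     # all c's first, then all d's

half = len(y) // 2
interleaved = [v for pair in zip(rearranged[:half], rearranged[half:])
               for v in pair]
assert interleaved == y            # the reordering is trivially invertible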
  • 36. 2D-DWT
• Since we have restricted the images to be square, we can perform at most l = log2 N levels of transform.
  • 38. 2D-DWT
• In order to preserve the average of a one dimensional signal, or the average brightness of an image, normalization factors have to be applied after the wavelet transform has taken place.
Normalization factors of the CDF(2,2) wavelet in two dimensions for each level l, 0 ≤ l < 5 (table shown on the slide).
  • 40. Lena: low- and high-pass filtered and subsampled
  • 41. 1-level 2-D wavelet decomposition
  • 42. 2-level 2-D wavelet decomposition
  • 43. 3-level 2-D wavelet decomposition
  • 45. Diagrammatic representation of the dyadic decomposition for three decomposition levels
  • 46. The in-place mapping scheme. The dyadic decomposition is applied on a hypothetical 8×8 original image.
  • 47. State-of-the-Art Image Compression Techniques
  • 48. Shapiro's Algorithm (EZW)
• In 1993, Shapiro presented an efficient method to compress wavelet transformed images: the embedded zerotree wavelet (EZW) encoder.
• The EZW encoder exploits the properties of the multi-scale representation.
• A significant improvement of this central idea was introduced by Said and Pearlman in 1996: Set Partitioning In Hierarchical Trees (SPIHT, pronounced "spite").
  • 49. Shapiro's Algorithm (EZW)
A multi-resolution analysis example: a lower octave has higher resolution and contains higher frequency information.
  • 50. Parent-Child Relationship of the LL Subband Shapiro’s Algorithm (EZW)
  • 51. Shapiro's Algorithm (EZW): Tree Structure of Wavelet Coefficients
• Every coefficient at a given scale can be related to a set of coefficients at the next finer scale of similar orientation.
• Parent: the coefficient at the coarse scale.
• Children: all coefficients corresponding to the same spatial location at the next finer scale of similar orientation.
• Descendants: for a given parent, the set of all coefficients at all finer scales of similar orientation corresponding to the same location.
  • 52. Hierarchical trees in multi-level decomposition
  • 53. Shapiro's Algorithm (EZW): coefficients at the same spatial location form a quad-tree.
  • 54. Shapiro's Algorithm (EZW)
• E – The EZW encoder is based on progressive encoding, also known as embedded encoding.
• Z – A data structure called the zero-tree is used in the EZW algorithm to encode the data.
• W – The EZW encoder is specially designed to be used with wavelet transforms. It was originally designed to operate on images (2-D signals).
  • 55. Shapiro's Algorithm (EZW)
• A kind of bitplane coding: the kth bits of the coefficients constitute a bitplane.
• A bitplane encoder starts coding with the most significant bit of each coefficient.
• Within a bitplane, the bits of the coefficients with the largest magnitude come first.
Figure: the coefficients are shown in decreasing order from left to right; each coefficient is represented with eight bits, with the least significant bit in front, and the labels mark sign, MSB, LSB and the coding order.
  • 67. Shapiro's Algorithm (EZW)
• Self-similarities between different scales result from the recursive application of the wavelet transform step to the low frequency band.
  • 68. Shapiro's Algorithm (EZW)
• Shapiro proposes to scan the samples from left to right and from top to bottom within each subband, starting in the upper left corner.
• The subbands LH, HL, and HH at each scale are scanned in that order.
• Furthermore, in contrast to traditional bitplane coders, he introduced data dependent examination of the coefficients.
• The idea is that if there are large areas with samples that are unimportant in terms of compression, they should be excluded from exploration. The addressed self-similarities are the key to performing such exclusions of large areas.
  • 69. Shapiro's Algorithm (EZW)
• In order to exploit the self-similarities during the coding process, oriented trees of outdegree four are used to represent a wavelet transformed image.
• Each node of a tree represents a coefficient of the transformed image.
• The levels of the trees consist of coefficients at the same scale.
• The trees are rooted at the lowest frequency subband of the representation. Each coefficient in the LH, HL, and HH subbands of each scale has four children; the coefficients at the highest frequency subbands have no children; there is only one coefficient in the lowest frequency band (the DC coefficient) that has three children.
Figure: oriented quad trees, four transform levels, N = 16.
  • 70. Shapiro's Algorithm (EZW)
A coefficient of the wavelet transformed image is insignificant with respect to a threshold th if its magnitude |c| is smaller than 2^th (the current threshold); otherwise it is called significant with respect to the threshold th.
 Dominant pass: the coefficients are scanned in raster order (from left to right and from top to bottom) within the quadrants. The scan starts with the quadrants of the highest transform level; in each transform level the quadrants are scanned in the order HL, LH, and HH. The coefficients are coded by the symbols P, N, ZTR, or IZ. A coefficient is coded by:
• P, if it is greater than the given threshold and is positive;
• N, if its absolute value is greater than the given threshold and it is negative;
• ZTR (zero tree root), if its absolute value is smaller than the given threshold and the absolute values of all coefficients in the corresponding quad tree are smaller than the threshold, too;
• IZ (isolated zero), if its absolute value is smaller than the given threshold and there exists at least one coefficient in the corresponding quad tree that is greater than the given threshold in absolute value.
  • 71. Shapiro's Algorithm (EZW)
Z:
 It is used within the high frequency bands of level one only, because the coefficients in these quadrants cannot be roots of a zerotree.
 It can thus be seen as the combination of ZTR and IZ for this special case. Once a coefficient is encoded as the symbol P or N, it is no longer included in the determination of zerotrees.
 Subordinate pass:
• Each coefficient that has been coded as P or N in a previous dominant pass is now refined by coding the th-bit of its binary representation.
• This corresponds to a bitplane coding, where the coefficients are refined in a data dependent manner.
• The most important fact here is that no indices of the coefficients under consideration have to be coded; this happens implicitly, due to the order in which they become significant (coded as P or N in the dominant pass). A sketch of the dominant-pass classification follows below.
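A sketch of the dominant-pass symbol assignment just described. The tree representation (a dict mapping each node to its children) and the integer node labels are assumptions made for this illustration:

def descendants(node, children):
    out = []
    for ch in children.get(node, []):
        out.append(ch)
        out.extend(descendants(ch, children))
    return out

def classify(node, coeff, children, T):
    c = coeff[node]
    if c >= T:
        return 'P'                       # significant, positive
    if c <= -T:
        return 'N'                       # significant, negative
    desc = descendants(node, children)
    if not desc:
        return 'Z'                       # level-one band: no tree possible
    if all(abs(coeff[d]) < T for d in desc):
        return 'ZTR'                     # zero tree root
    return 'IZ'                          # isolated zero

# Tiny example: a root with four insignificant children at threshold 32.
coeff = {0: 10, 1: 5, 2: -3, 3: 7, 4: 1}
children = {0: [1, 2, 3, 4]}
print(classify(0, coeff, children, 32))  # -> 'ZTR'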
  • 72. Shapiro's Algorithm (EZW)
Example: encoding the wavelet transformed image given in the figure using the embedded zerotree wavelet algorithm of Shapiro; the subband HL is excluded and the subband LH is scanned only partially.
  • 73. EZW – basic concepts
The EZW algorithm is based on two observations:
– Natural images in general have a low pass spectrum. When an image is wavelet transformed, the energy in the sub-bands decreases as the scale goes lower (low scale means high resolution), so the wavelet coefficients will, on average, be smaller in the lower levels than in the higher levels.
– Large wavelet coefficients are more important than small wavelet coefficients.
(The slide shows typical wavelet coefficients for an 8×8 block of a real image, with the higher levels in the upper left corner.)
  • 74. EZW – basic concepts
The observations give rise to the basic progressive coding idea:
1. We can set a threshold T; if a wavelet coefficient is larger than T, we encode it as 1, otherwise as 0.
2. A '1' will be reconstructed as T (or a number larger than T) and a '0' will be reconstructed as 0.
3. We then decrease T to a lower value and repeat steps 1 and 2, obtaining finer and finer reconstructed data.
The actual implementation of the EZW algorithm must consider:
1. What should we do with the sign of the coefficients (positive or negative)? Answer: use POS (P) and NEG (N).
2. Can we code the '0's more efficiently? Answer: the zero-tree.
3. How do we decide the threshold T and how do we reconstruct? Answer: see the algorithm.
  • 75. EZW – basic concepts
• The definition of the zero-tree: there are coefficients in different subbands that represent the same spatial location in the image, and this spatial relation can be depicted by a quad tree, except for the root node at the top left corner representing the DC coefficient, which has only three children nodes.
• Zero-tree hypothesis: if a wavelet coefficient c at a coarse scale is insignificant with respect to a given threshold T, i.e. |c| < T, then all wavelet coefficients of the same orientation at finer scales are also likely to be insignificant with respect to T.
  • 76. EZW – basic concepts
First step: the DWT of the entire 2-D image is computed by the FWT.
Second step: EZW progressively encodes the coefficients by decreasing the threshold.
Third step: arithmetic coding is used to entropy code the symbols.
  • 77. EZW – basic concepts
Second step: EZW progressively encodes the coefficients by decreasing the threshold. Here MAX() means the maximum coefficient magnitude in the image and y(x,y) denotes a coefficient; the initial threshold t0 is the largest power of two not exceeding MAX(|y(x,y)|). With this threshold we enter the main coding loop:
threshold = initial_threshold;
do {
  dominant_pass(image);
  subordinate_pass(image);
  threshold = threshold/2;
} while (threshold > minimum_threshold);
The main loop ends when the threshold reaches a minimum value, which can be specified to control the encoding performance; a "0" minimum value gives lossless reconstruction of the image.
  • 78. EZW – basic concepts
In the dominant_pass:
• All the coefficients are scanned in a special order.
• If a coefficient is a zero tree root, it is encoded as ZTR. None of its descendants need to be encoded; they will be reconstructed as zero at this threshold level.
• If the coefficient itself is insignificant but one of its descendants is significant, it is encoded as IZ (isolated zero).
• If the coefficient is significant, it is encoded as POS (P) or NEG (N) depending on its sign.
This encoding of the zero tree produces significant compression, because gray level images resulting from natural sources typically result in DWTs with many ZTR symbols. Each ZTR indicates that no more bits are needed to encode the descendants of the corresponding coefficient.
  • 79. EZW – basic concepts
At the end of the dominant_pass, all the coefficients that are larger in absolute value than the current threshold are extracted and placed, without their sign, on the subordinate list, and their positions in the image are filled with zeroes. This prevents them from being coded again.
In the subordinate_pass, all the values in the subordinate list are refined. This gives rise to some juggling with uncertainty intervals, and it outputs the next most significant bit of all the coefficients in the subordinate list. A sketch of the encoder loop follows below.
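The same loop written out in Python, with the initial threshold chosen as the largest power of two not exceeding the largest coefficient magnitude; the passes themselves are left as placeholders:

import math

def ezw_encode(coeffs, minimum_threshold=0,
               dominant_pass=lambda c, T: None,
               subordinate_pass=lambda c, T: None):
    # initial threshold: 2 ** floor(log2(max |c|))
    T = 2 ** int(math.log2(max(abs(c) for c in coeffs)))
    while T > minimum_threshold:
        dominant_pass(coeffs, T)       # emits P/N/ZTR/IZ symbols
        subordinate_pass(coeffs, T)    # refines coefficients found so far
        T //= 2
    # minimum_threshold = 0 runs down to T = 1, i.e. lossless for integers

ezw_encode([63, -34, 49, 10, 7, 13])   # initial threshold is 32 here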
  • 80. EZW – an example
Wavelet coefficients for an 8×8 block.
  • 81–83. EZW – an example
The initial threshold is 32, and the result of the dominant_pass is shown below; entries without a symbol are nodes inside a zero-tree (together the symbols form the significance map):
63 POS | -34 NEG | 49 POS | 10 ZTR | 7 IZ | 13 IZ | -12 | 7
-31 IZ | 23 ZTR | 14 ZTR | -13 ZTR | 3 IZ | 4 IZ | 6 | -1
15 ZTR | 14 IZ | 3 | -12 | 5 | -7 | 3 | 9
-9 ZTR | -7 ZTR | -14 | 8 | 4 | -2 | 3 | 2
-5 | 9 | -1 IZ | 47 POS | 4 | 6 | -2 | 2
3 | 0 | -3 IZ | 2 IZ | 3 | -2 | 0 | 4
2 | -3 | 6 | -4 | 3 | 6 | 3 | 6
5 | 11 | 5 | 6 | 0 | 3 | -4 | 4
  • 84. EZW – an example
The result of the dominant_pass is output as:
D1: POS, NEG, IZ, ZTR, POS, ZTR, ZTR, ZTR, ZTR, IZ, ZTR, ZTR, IZ, IZ, IZ, IZ, IZ, POS, IZ, IZ
The significant coefficients are put on a subordinate list and refined; a one-bit symbol per coefficient is output to the decoder:
Original data: 63, 34, 49, 47
Output symbols (S1): 1, 0, 1, 0
Reconstructed data: 56, 40, 56, 40
For example, the code for 63 so far is sign 0 and bit 1 at plane 32; the subordinate pass now emits the plane-16 bit. If the data item is at least T + 0.5T = 48, a 1 is put in the code and the item is reconstructed as the average of 1.5T and 2T, i.e. (48 + 64)/2 = 56. If it is below T + 0.5T, a 0 is put in the code and the item is reconstructed as the average of T and 1.5T, i.e. (32 + 48)/2 = 40. So 63 is reconstructed as 56, and 34 as 40.
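A sketch of this uncertainty-interval refinement: each subordinate bit halves the interval in which a significant coefficient is known to lie, and the decoder reconstructs at the midpoint of the remaining interval:

def refine(value, T, passes):
    lo, hi = T, 2 * T                 # known significant at threshold T
    bits = []
    for _ in range(passes):
        mid = (lo + hi) / 2
        if value >= mid:
            bits.append(1); lo = mid  # upper half of the interval
        else:
            bits.append(0); hi = mid  # lower half
    return bits, (lo + hi) / 2        # emitted bits, reconstructed value

print(refine(63, 32, 1))   # ([1], 56.0)  as in the worked example
print(refine(34, 32, 1))   # ([0], 40.0)
print(refine(63, 32, 2))   # ([1, 1], 60.0)  after the second pass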
  • 85. EZW – an example
After the dominant pass, the significant coefficients are replaced by * in the significance map. Then the threshold is divided by 2, so 16 becomes the current threshold.
(The slide shows the 8×8 map with 63, -34, 49 and 47 replaced by *.)
  • 86. EZW – an example
The result of the second dominant pass is output as:
D2: IZ, ZTR, NEG, POS, IZ, IZ, IZ, IZ, IZ, IZ, IZ, IZ
The significant coefficients are put on the subordinate list, and all data in this list are refined:
Original data: 63, 34, 49, 47, 31, 23
Output symbols: 1, 0, 0, 1, 1, 0
Reconstructed data: 60, 36, 52, 44, 28, 20
For example, the code for 63 is now sign 0 and bits 1 1 1 at planes 32, 16 and 8. The computation is extended to the next significant bit, so 63 is reconstructed as the average of 56 and 64, i.e. 60.
  • 87. EZW – an example
The process goes on until threshold = 1; the final output is (p = POS, n = NEG, z = IZ, t = ZTR, Di = ith dominant pass, Si = ith subordinate pass output):
D1: pnztpttttztttttttptt
S1: 1010
D2: ztnptttttttt
S2: 100110
D3: zzzzzppnppnttnnptpttnttttttttptttptttttttttptttttttttttt
S3: 10011101111011011000
D4: zzzzzzztztznzzzzpttptpptpnptntttttptpnpppptttttptptttpnp
S4: 11011111011001000001110110100010010101100
D5: zzzzztzzzzztpzzzttpttttnptppttptttnppnttttpnnpttpttppttt
S5: 10111100110100010111110101101100100000000110110110011000111
D6: zzzttztttztttttnnttt
For example, the full code for 63 is sign 0 and bits 1 1 1 1 1 1 at planes 32, 16, 8, 4, 2, 1, so 63 is reconstructed as 32+16+8+4+2+1 = 63. Note how progressive transmission can be done.
  • 88. SPIHT – Set Partitioning In Hierarchical Trees
(The slide shows example wavelet coefficient matrices.)
  • 89. SPIHT – Set Partitioning In Hierarchical Trees
Said and Pearlman have significantly improved the codec of Shapiro. The main idea is based on the partitioning of sets, which consist of coefficients or representatives of whole subtrees. They classify the coefficients of a wavelet transformed image into three sets:
• LIP: the list of insignificant pixels, which contains the coordinates of those coefficients that are insignificant with respect to the current threshold th;
• LSP: the list of significant pixels, which contains the coordinates of those coefficients that are significant with respect to th;
• LIS: the list of insignificant sets, which contains the coordinates of the roots of insignificant subtrees.
During the compression procedure, the sets of coefficients in LIS are refined, and if coefficients become significant they are moved from LIP to LSP.
  • 90. SPIHT – Set Partitioning In Hierarchical Trees
• The first difference to Shapiro's EZW algorithm is the distinct definition of significance: here, the root of the tree is excluded from the computation of the significance attribute.
• The sets O(i, j), D(i, j) and L(i, j) (offspring, descendants, and descendants without offspring) are defined on the following slides.
  • 91. SPIHT – Set Partitioning In Hierarchical Trees
Example: the sets O(i, j), D(i, j) and L(i, j) with i = 1 and j = 1; the labels O, D, L show that a coefficient is a member of the corresponding set (N = 8).
  • 92. SPIHT – Set Partitioning In Hierarchical Trees
The entries of LIS are of type A or type B (illustrated on the slide).
  • 93. SPIHT – Set Partitioning In Hierarchical Trees: Significance Attribute
In the SPIHT algorithm the significance is computed for the sets D(i, j) and L(i, j). The root of each quadtree is, in contrast to the algorithm presented by Shapiro, not included in the computation of the significance.
  • 94. SPIHT – Set Partitioning In Hierarchical Trees
Parent-child relationship of the LL subband; the figure shows the different parent-child relationships within the LL band.
  • 96. SPIHT Sorting Pass
• O(i,j): the set of coordinates of all offspring of node (i,j); children only.
• D(i,j): the set of coordinates of all descendants of node (i,j); children, grandchildren, great-grandchildren, etc.
• L(i,j) = D(i,j) − O(i,j): all descendants except the offspring; grandchildren, great-grandchildren, etc.
  • 97. SPIHT Refinement Pass
• O(i,j), D(i,j) and L(i,j) as defined for the sorting pass.
• H: the set of all tree roots (nodes in the highest pyramid level).
• The refinement pass outputs the kth bit of each element of the list LSP, if it was not included in the last sorting pass.
A sketch of these sets follows below.
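A sketch of the sets for a q×q transformed image. The children rule (i, j) -> (2i, 2j), (2i, 2j+1), (2i+1, 2j), (2i+1, 2j+1) is the generic quad-tree rule and deliberately ignores the special LL-band cases shown earlier:

def O(i, j):
    """Offspring: the four children of node (i, j)."""
    return [(2*i, 2*j), (2*i, 2*j+1), (2*i+1, 2*j), (2*i+1, 2*j+1)]

def D(i, j, q):
    """All descendants of (i, j): children, grandchildren, ..."""
    out = []
    for (a, b) in O(i, j):
        if a < q and b < q:
            out.append((a, b))
            out.extend(D(a, b, q))
    return out

def L(i, j, q):
    """Descendants minus offspring: grandchildren and below."""
    return [n for n in D(i, j, q) if n not in O(i, j)]

def significant(nodes, coeff, n):
    """Significance of a set against the threshold 2**n (root excluded)."""
    return any(abs(coeff[p]) >= 2 ** n for p in nodes)

coeff = {p: 0 for p in D(1, 1, 8)}
coeff[(2, 2)] = 40
print(significant(D(1, 1, 8), coeff, 5))   # True: 40 >= 2**5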
  • 98. SPIHT – Set Partitioning In Hierarchical Trees
Example of SPIHT: initially the lists hold all tree roots and the D sets of those roots (figure labels: "All roots", "D's of roots").
  • 99–102. SPIHT – Set Partitioning In Hierarchical Trees
(The worked example is continued on these slides for n = 4, n = 3 and n = 2.)
  • 103–104. SPIHT – Set Partitioning In Hierarchical Trees
Example of a 3-scale wavelet transform of an 8 by 8 image (coefficient matrix shown on the slide).
  • 105. SPIHT – Set Partitioning In Hierarchical Trees
0) Initialization:
LSP ← {}
LIP ← {(63, -34, -31, 23)}
LIS ← {(-34, -31, 23)}
1) n = 5 (T = 2^5 = 32):
LSP ← {(63, -34, 49, 47)}
LIP ← {(-31, 23), (10, 14, -13), (15, 14, -9, -7), (-1, -3, 2)}
LIS ← {(23), -34B, (15, -9, -7)}
Output: 5, 1+1-0011+000010000100101+0000
(Decoded image after this pass: the four significant coefficients are reconstructed as ±32, all other positions are still 0.)
  • 106. SPIHT – Set Partitioning In Hierarchical Trees
2) n = 4 (T = 2^4 = 16):
LSP ← {(63, -34, 49, 47), (-31, 23)}
LIP ← {(10, 14, -13), (15, 14, -9, -7), (-1, -3, 2)}
LIS ← {(23), -34B, (15, -9, -7)}
Output – sorting pass: 4, 1-1+000000000000000; refinement pass: 1010
(Decoded image after this pass shown on the slide.)
  • 107. SPIHT – Set Partitioning In Hierarchical Trees
3) n = 3 (T = 2^3 = 8):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9)}
LIP ← {(-7), (-1, -3, 2), (3), (-5, 3, 0), (2, -3, 5), (7, 3, 4), (7, 6, -1), (3, 3, 2)}
LIS ← {(-7), 23B, (14)}
Output – sorting pass: 3, 1+1+1-1+1+1-0000101-1-1+01101+0010001+0101+0011-0000101+00; refinement pass: 100110
(Decoded image after this pass shown on the slide.)
  • 108. SPIHT – Set Partitioning In Hierarchical Trees
4) n = 2 (T = 2^2 = 4):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9), (-7, -5, 5, 7, 4, 7, 6, 6, -4, 5, 6, 5, -7, 4, 4, 6, 4, 6, 6, -4, 4)}
LIP ← {(-1, -3, 2), (3), (3, 0), (2, -3), (3), (-1), (3, 3, 2), (-2), (3, -2), (-2, 2, 0), (3, 0, 3), (3)}
LIS ← {}
Output – sorting pass: 2, 1-00001-00001+1+01+1+1+000011+1-1+1+111+1-1+011+1+0010001+101+00101+1-1+; refinement pass: 10011101111011000110
(Decoded image after this pass shown on the slide.)
  • 109. SPIHT – Set Partitioning In Hierarchical Trees
5) n = 1 (T = 2^1 = 2):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9), (-7, -5, 5, 7, 4, 7, 6, 6, -4, 5, 6, 5, -7, 4, 4, 6, 4, 6, 6, -4, 4), (-3, 2, 3, 3, 2, -3, 3, 3, 3, 2, -2, 3, -2, -2, 2, 3, 3, 3)}
LIP ← {(-1), (0), (-1), (0), (0)}
LIS ← {}
Output – sorting pass: 1, 01-1+1+1+01+1-1+01+1+1+1-1+1-1-1+01+01+1+; refinement pass: 11011111011001001000100101110010100101100
(Decoded image after this pass shown on the slide.)
  • 110. SPIHT – Set Partitioning In Hierarchical Trees
6) n = 0 (T = 2^0 = 1):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9), (-7, -5, 5, 7, 4, 7, 6, 6, -4, 5, 6, 5, -7, 4, 4, 6, 4, 6, 6, -4, 4), (-3, 2, 3, 3, 2, -3, 3, 3, 3, 2, -2, 3, -2, -2, 2, 3, 3, 3), (-1, -1)}
LIP ← {(0), (0), (0)}
LIS ← {}
Output – sorting pass: 0, 1-01-00; refinement pass: 10111100110100011101111101000101100000000101101111001000111
(The decoded image now matches the original coefficients exactly.)
  • 111. SPIHT – Set Partitioning In Hierarchical Trees
The basic algorithm operates on the three lists:
• LIP: List of Insignificant Pixels
• LIS: List of Insignificant Sets
• LSP: List of Significant Pixels
Initially, LIP holds all tree roots and LIS holds the D sets of the roots.
  • 115. What are the advantages and disadvantages of this partitioned approach?
+ The internal memory requirements are dramatically reduced.
+ The subimages can be transformed independently of each other.
+ The traditional 2D-DWT can be directly applied to the subimages without any modifications.
− It is not directly applicable to lossy compression: without boundary treatment, block artifacts appear (see below).
• 117. Boundary Treatment
Which pixels have to be considered in order to compute the low and high pass coefficients using the CDF(2,2) wavelet?
When targeting lossy compression, a partitioned wavelet transform without boundary treatment introduces block artifacts.
• 118. (figures: the computation of the low pass coefficients and their dependencies on the coefficients of the previous scales; the area of coefficients that has to be incorporated into the computation of the transform of the partition under consideration)
• 119. Modifications to the SPIHT Codec
In this section we present the modifications necessary to obtain an efficient hardware implementation of the SPIHT compressor based on the partitioned approach to the wavelet transform of images.
 At first, we exchange the sorting phase with the refinement phase to save memory for status information.
 The greatest challenge is the hardware implementation of the three lists LIP, LSP, and LIS.
• 120. Exchange of Sorting and Refinement Phase
In the basic SPIHT algorithm, status information has to be stored for the elements of LSP, specifying whether the corresponding coefficient has been added to LSP in the current iteration of the sorting phase (see Line 20). In the worst case all coefficients become significant in the same iteration. Consequently, we would have to provide a memory capacity of q^2 bits to store this information. However, if we exchange the sorting and the refinement phase, this information is no longer needed. The compressed data stream is still decodable and does not increase in size. Of course, we have to account for the reordering of the transmitted bits within an iteration.
• 121. Memory Requirements of the Ordered Lists
To obtain an efficient realization of the lists LIP, LSP, and LIS, we first have to specify the operations that take place on these lists and deduce worst case space requirements in a software implementation.
Estimations for LIS: we have to provide the following operations for LIS:
• initialize as empty list (Line 0),
• append an element (Lines 0 and 19),
• sequentially iterate over the elements (Line 5),
• delete the element under consideration (Lines 15 and 19),
• move the element under consideration to the end of the list and change its type (Line 14).
• 122. LIS contains at most (q/2)^2 elements, since only coordinates of the coarser q/2 × q/2 area can be roots of sets (cf. slide 147). To specify the coordinates of an element, 2 log2(q/2) bits are needed. A further bit per element is required to code the type information. This results in an overall space requirement for LIS of (q/2)^2 · (2 log2(q/2) + 1) bits.
• 123. Estimations for LIP and LSP
Now let us consider the lists LIP and LSP together, because they can be implemented jointly. Again, we start with the operations applied to both lists:
• initialize as empty list (Line 0),
• append an element (Lines 0, 4, 11, and 12),
• sequentially iterate over the elements (Line 20),
• delete the element under consideration from LIP (Line 4).
The overall space requirement for both lists is q^2 · 2 log2 q bits.
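As a quick sanity check of these worst case bounds (my arithmetic, assuming a subimage size of q = 16 as used by the internal coefficient memory of the lossy architecture later):

\[
\left(\tfrac{q}{2}\right)^2\!\left(2\log_2\tfrac{q}{2}+1\right) = 8^2 \cdot 7 = 448 \text{ bits (LIS)}, \qquad
q^2 \cdot 2\log_2 q = 16^2 \cdot 8 = 2048 \text{ bits (LIP and LSP)} .
\]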
• 126. Prototyping Environment
The prototyping environment provides several mechanisms to exchange data between the mounted Xilinx device, the local SRAM, and the PC main memory. These can be categorized into:
1. direct access to registers/latches (configured into the FPGA),
2. access to memory cells of the local SRAM (all data transfers go through the FPGA),
3. DMA transfers and DMA-on-demand transfers,
4. interrupts.
To communicate with the PCI card, one writes a simple C/C++ program which initializes the card, configures the FPGA, sets the clock rate, and starts the data transfer or the computation.
• 127. The Xilinx XC4085 XLA device
• The Xilinx XC4085 device consists of a matrix of Configurable Logic Blocks (CLBs) with 56 rows and 56 columns.
• These CLBs are SRAM based and can be configured many times; after the power supply is shut down, they have to be reconfigured.
• Each CLB can be configured to represent any function with 4 or 5 inputs (function generators F, G, or H).
• In addition, each CLB mainly provides two flip-flops, two tristate drivers, and the so-called carry logic.
• 128. The Xilinx XC4085 XLA device
The routing is done using programmable switch matrices (PSMs). (figures: (a) interconnections between the CLBs; (b) switch matrices in the XC4085XLA)
• Each CLB contains hard-wired carry logic to accelerate arithmetic operations.
• Fast adders can be constructed as simple ripple carry adders, using the special carry logic to calculate and propagate the carry between full adders in adjacent CLBs. One CLB is capable of realizing two full adders.
• 129. The Xilinx XC4085 XLA device
• Each CLB can be configured to provide internal RAM. In order to do this, conceive the four input signals to F and G as address lines.
• Thus a CLB provides either two 16×1 RAM modules or one 32×1 RAM module.
• At maximum we have about 12 Kbytes of RAM available.
• Each RAM block can be configured with different behavior:
• synchronous RAM: edge-triggered
• asynchronous RAM: level-sensitive
• single-port RAM
• dual-port RAM
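The 12 Kbyte figure follows directly from the CLB matrix (my arithmetic):

\[
56 \cdot 56 \ \text{CLBs} \times 32 \ \text{bits per CLB} = 100352 \ \text{bits} \approx 12.25 \ \text{Kbytes}.
\]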
• 130. 2D-DWT FPGA Architectures targeting Lossless Compression
In the following, all four available architectures are listed.
• one Lifting unit
– data transfer from the PC directly to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
• four Lifting units working in parallel
– data transfer from the PC to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
• 131. The implementations with four Lifting units introduce parallelism with respect to the rows of a subimage. Once the FPGA is configured, a whole image can be transferred to the SRAM of the PCI card (maximum size 2 Mbytes). Each subimage is then loaded into the internal memory, wavelet transformed, and the result is stored in the SRAM, from where it can be transferred back to the PC. Another choice is to write a subimage directly into the internal memory of the FPGA, start the transform, and read the wavelet transformed subimage back to the PC. The computation itself is started when an internal status register is set by the software interface. The RAM itself is further decomposed into smaller units to support the parallel execution of the four Lifting units; it can store a subimage of size 32×32. There exist two separate finite state machines to control the wavelet transform and its inverse, respectively. (figure: the global data path diagram of a circuit with four Lifting units working in parallel)
• 132. 2D-DWT FPGA Architectures targeting Lossy Compression
We will distinguish between two different architectures for the partitioned discrete wavelet transform on images:
I. a 2D-DWT FPGA architecture based on the divide and conquer technique,
II. a pipelined 2D-DWT FPGA architecture.
• 133. 2D-DWT FPGA Architecture based on the Divide and Conquer Technique
In order to compute one level of the CDF(2,2) wavelet transform in one dimension, we need two pixels to the left and one pixel to the right of each row of an image. Thus, we extend each row by two memory cells on the left and one on the right; we call such an enlarged row an extended row. Let r be a row of an image of length 16. Now the problem is split into two subproblems which are solved interlocked. The first module computes the 'inner' coefficients c0,0, . . . , c0,15 and d0,0, . . . , d0,15, which only depend on the extended row. In order to proceed to the next level of the wavelet transform, the coefficients at the appended positions have to be brought up to date, i.e., the coefficients c0,−2, c0,−1, and c0,16 have to be computed. The first module, which is responsible for the inner coefficients, computes the subtotals of c0,−2, c0,−1, and c0,16 that only depend on the extended row. The second module computes the subtotals of c0,−2, c0,−1, and c0,16 that depend on pixels not in the row.
• 134. 2D-DWT FPGA Pipelined Architecture
• two one-dimensional DWT units (1D-DWT) for the horizontal and vertical transforms
• a control unit realized as a finite state machine
• an internal memory block
To process a subimage, all rows are transferred to the FPGA over the PCI bus and transformed on the fly in the horizontal 1D-DWT unit using pipelining. The coefficients computed in this way are stored in internal memory of different types; the coefficients corresponding to the rows of the subimage itself are stored in single port RAM. Then the vertical transform levels take place, carried out by the vertical 1D-DWT unit. The control unit coordinates these steps in order to process a whole subimage and is responsible for generating enable signals, address lines, and so on. At the end, the wavelet transformed subimage is available in the internal RAM. At this point an EZW algorithm can be applied to the multiscale representation of the subimage. Since all necessary boundary information was included in the computation, no block artifacts are introduced by the subsequent quantization.
• 135. 4-level Horizontal 1D-DWT unit
The whole horizontal transform is done for the 16 rows of the subimage under consideration. In addition, 30 rows of the neighboring subimage in the north and 15 rows of the southern subimage are transformed in the same manner; these additional computations are required by the vertical DWT applied next. The unit has to take four pixels of a row per clock cycle and must perform 4 levels of horizontal transforms. It consists of four pipelined stages, one for each transform level.
(figure: the first stage runs at F = f0, the second at F = f0/2; each stage splits its input into even and odd samples, alternately outputs a low or a high frequency coefficient per clock cycle, and feeds the input of the next stage)
• 136. Recall: Lifting Scheme for the CDF(2,2) wavelet after Integer-to-Integer Mapping
• 137. The w-bit input 1D-DWT unit (1i2o)
To implement one level of the DWT using the lifting method, the following steps are necessary:
• split the input into coefficients at odd and even positions,
• perform a predict step,
• perform an update step.
(The predict and update equations were given as figures; a reconstruction follows below.)
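For the integer-to-integer CDF(2,2) scheme these two steps are commonly written as d_l = x_{2l+1} − ⌊(x_{2l} + x_{2l+2})/2⌋ (predict) and c_l = x_{2l} + ⌊(d_{l−1} + d_l + 2)/4⌋ (update). The following C sketch is my reconstruction of this standard scheme, not code from the slides; the boundary handling uses the symmetric extension described on slide 156.

```c
#include <stddef.h>

/* Symmetric (whole-sample) extension of a length-n signal. */
static int ext(const int *x, long i, long n)
{
    if (i < 0)  i = -i;
    if (i >= n) i = 2 * (n - 1) - i;
    return x[i];
}

/* One level of the integer-to-integer CDF(2,2) lifting transform.
   x: n input samples (n even); c: n/2 low pass, d: n/2 high pass outputs.
   ">> 1" and ">> 2" realize the floor divisions (arithmetic shift assumed). */
void lifting_cdf22(const int *x, int *c, int *d, long n)
{
    long half = n / 2;
    /* predict: d[l] = x[2l+1] - floor((x[2l] + x[2l+2]) / 2) */
    for (long l = 0; l < half; l++)
        d[l] = ext(x, 2*l + 1, n)
             - ((ext(x, 2*l, n) + ext(x, 2*l + 2, n)) >> 1);
    /* update: c[l] = x[2l] + floor((d[l-1] + d[l] + 2) / 4) */
    for (long l = 0; l < half; l++) {
        int dm1 = (l > 0) ? d[l - 1] : d[0];  /* symmetric extension of d */
        c[l] = ext(x, 2*l, n) + ((dm1 + d[l] + 2) >> 2);
    }
}
```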
• 140. w-bit input 1D-DWT unit (according to the lifting scheme)
The unit consists of two register chains. The registers in the upper chain are enabled at even clock edges, the registers in the lower chain at odd clock edges. This splits the input into words at even and odd positions. Now the predict and update steps can be applied in a straightforward way.
• 141. The 4w-bit input 1D-DWT unit (4i4o)
This unit takes four pixels of the same row (i) at a time. We use it for both the first and the second level of the transform.
• 142. Internal Coefficient Memory
• The internal memory for the wavelet coefficients can store 16×16 coefficients.
• Since the bitwidth of the coefficients differs between the corresponding subbands, the memory block consists of 5 slices.
• Figure (a) shows once again the minimum required bitwidth for each subband; the structure of the internal memory is illustrated in Figure (b).
• 143. FPGA-Implementation of the Modified SPIHT Encoder
(figure: block diagram of the modified SPIHT compressor; the module 'coeff' stores the wavelet coefficients, LP := LIP ∪ LSP, and SL and SD store the pre-computed significance attributes for all thresholds)
• 144. (table: size in bits of each RAM block)
In comparison to the basic SPIHT algorithm, the modified SPIHT codec reduces the internal memory considerably (N = 512, d0 = 11); the exact before and after terms were given as formulas in the figure.
• 145. The overall functionality
• Each subimage of the wavelet transformed image is transferred once to the internal memory module named 'coeff' or is already stored there.
• At first, the initialization of the modules representing LIP, LSP, and LIS and the computation of the significances are done in parallel.
• The lists LIP and LSP are managed by the module 'LP', the bitmap of LIS by the module 'LIS'.
• The significances of sets are computed for all thresholds th ≤ kmax at once and are stored in the modules named 'SL' and 'SD', respectively. Here we distinguish between the significances for the sets L and D.
• With this information the compression can be started with bit plane kmax.
• Finite state machines control the overall procedure.
• The data to be output is registered in the module 'output', from which it is written to the local SRAM on the PCI card over a 32 bit wide data bus.
• Additionally, an arithmetic coder can be configured into that module, which further reduces the size of the compressed data stream.
  • 146. Hardware Implementation of the Lists • To reduce the memory requirement for the list data structures in the worst case, we implement the lists as bitmaps. • The bitmap of list L represents the characteristic function of list L.
• 147. • The RAM module which realizes LIP and LSP has a configuration of q × q entries of bit length 2, since for each pixel of the q × q subimage either (i, j) ∈ LIP, (i, j) ∈ LSP, or (i, j) ∉ LIP ∪ LSP holds. (figure: possible configuration states of a coordinate (i, j))
• The second RAM module implements LIS. Since none of the coefficients in the three finest subbands can be the root of a zerotree, a bitmap of size (q/2)^2 bits suffices. Type information has to be stored only for the area which corresponds to LL(1): coefficients there can be of type B, whereas all other coefficients are always of type A. This results in additional (q/4)^2 bits.
• 148. Efficient Computation of Significances
 Computing the significance of an individual coefficient is trivial: just select the kth bit of |ci,j| in order to obtain Sk(i, j). This can be realized by using bit masks and a multiplexer.
 Computing the significance of sets for all thresholds in parallel: we define S*(T) as the maximum threshold k for which some coefficient in T becomes significant (the formula was given as a figure; see the reconstruction after this slide).
 Once S*(T) is computed for all sets L and D, we have preprocessed the significances of sets for all thresholds. In order to do this, we use the two RAM modules SL and SD. (figure: organization of the RAM modules SL and SD)
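Written out, the definition reads (my reconstruction, consistent with the verbal description above):

\[
S^*(T) \;=\; \max\{\, k : S_k(T) = 1 \,\} \;=\; \left\lfloor \log_2 \max_{(i,j)\in T} |c_{i,j}| \right\rfloor ,
\]

so that S_k(T) = 1 holds exactly for all thresholds k ≤ S*(T).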
• 149. • The computation is done bottom up in the hierarchy defined by the spatial orientation trees.
• The entries of both RAMs are initialized with zero.
• Now let (e, f) be a coordinate with 0 < e, f < q just handled by the bottom up process, and let (i, j) = (⌊e/2⌋, ⌊f/2⌋) be the parent of (e, f) if it exists. Then SD and SL are updated by the following process. (figure: the update process, lines (6.1)-(6.3))
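The update rules (6.1)-(6.3) appeared only as a figure. Functionally, they propagate maxima of S* values up the trees: S*(D(i,j)) is the maximum of S* over the four children and S*(L(i,j)), and S*(L(i,j)) is the maximum of S*(D(child)) over the four children. A recursive C reference implementation of that computation (my sketch of the math, not of the hardware FSM, which does the same thing bottom up):

```c
#include <stdlib.h>

#define Q 16  /* subimage size; illustrative */

/* S* of a single coefficient: floor(log2 |c|), or -1 for c == 0. */
static int star(int c)
{
    int a = abs(c), k = -1;
    while (a) { a >>= 1; k++; }
    return k;
}

static int max2(int a, int b) { return a > b ? a : b; }

static int sd(const int c[Q][Q], int i, int j);

/* S*(L(i,j)): all descendants of (i, j) except the direct children. */
static int sl(const int c[Q][Q], int i, int j)
{
    int m = -1;
    for (int di = 0; di < 2; di++)
        for (int dj = 0; dj < 2; dj++)
            m = max2(m, sd(c, 2*i + di, 2*j + dj));
    return m;
}

/* S*(D(i,j)): the children (2i,2j), (2i,2j+1), (2i+1,2j), (2i+1,2j+1)
   plus all their descendants. */
static int sd(const int c[Q][Q], int i, int j)
{
    if (2*i + 1 >= Q || 2*j + 1 >= Q)
        return -1;  /* no children at the finest level */
    int m = -1;
    for (int di = 0; di < 2; di++)
        for (int dj = 0; dj < 2; dj++)
            m = max2(m, star(c[2*i + di][2*j + dj]));
    return max2(m, sl(c, i, j));
}
```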
• 150. • After reset we start the computation in state 1 and initialize k'max and the row and column indices e and f.
• At this time SD(e, f) and SL(e, f) hold their old values from the last subimage under consideration for all 0 < e, f < q.
• If the enable signal becomes active, we proceed to state 2. Here we buffer the present value of SD(e, f).
• In states 3, 4, 5, and 6 we compute line (6.1).
• The condition "e and f are odd" checks whether we visit a 2 × 2 coefficient block for the first time.
• States 2, 5, and 6 are responsible for computing the maximum of SD(i, j) and S*(e, f) (line (6.2)), which is buffered in tS.
• State 8 performs the assignment in line (6.3). Furthermore, this finite state machine updates the value k'max = kmax + 1 for the subimage under consideration.
• In state 10 the low frequency coefficient at position (0, 0) is included in this computation, too.
• The operation in state 9 is done using a simple subtractor and a combined representation with interleaved bit order of the row and column indices e and f, that is fn−1, en−1, fn−2, en−2, . . . , f1, e1, f0, e0.
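The interleaved bit order fn−1, en−1, . . . , f0, e0 is simply the Morton (Z-order) index of (e, f); a small illustrative helper in C:

```c
#include <stdint.h>

/* Morton/Z-order address: interleave the bits of the row index e and the
   column index f so the address reads f(n-1), e(n-1), ..., f0, e0. */
static uint32_t interleave(uint16_t e, uint16_t f, int n)
{
    uint32_t addr = 0;
    for (int b = 0; b < n; b++) {
        addr |= (uint32_t)((e >> b) & 1u) << (2 * b);      /* e bits: even positions */
        addr |= (uint32_t)((f >> b) & 1u) << (2 * b + 1);  /* f bits: odd positions  */
    }
    return addr;
}
```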
• 153. Hardware platform
The hardware platform used [WILDFORCE] is a PCI plug-in board with five Xilinx 4085 FPGAs, also referred to as PEs (Processing Elements). The board is equipped with five 1 MB SRAM chips, each directly connected to one of the five PEs. The embedded memory is accessible for read/write from both the host computer and the corresponding PE. Each 1 MB memory chip is organized as 262144 words of 32 bits.
• 154. Memory read/write
• The input image: 512 by 512 pixels.
• Input frames are loaded into the embedded memory by the host computer, and the results are read back once the PE has processed them.
• The PE also uses the embedded memory as intermediate storage to hold results between different stages of processing.
• Memory reads can be pipelined so that the effect of the read latency is minimized.
• 155. Design partitioning
The whole computation is partitioned into two stages, implemented on two separate FPGAs.
The first stage computes the discrete wavelet transform coefficients of the input image frame and writes them back to the embedded memory.
The second stage operates on this result to complete the rest of the processing: dynamic quantization, zero thresholding, run length encoding of zeroes, and entropy encoding of the coefficients.
• 156. Stage 1: Discrete Wavelet Transform
• A modified form of the biorthogonal (2,2) Cohen–Daubechies–Feauveau wavelet filter is used.
• The boundary conditions are handled by symmetric extension of the coefficients.
• The analysis and synthesis filter equations were shown as figures; a reconstruction of the standard form follows below.
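For reference, the unmodified CDF(2,2) filter bank in lifting form is commonly stated as follows (my reconstruction of the standard equations; the "modified form" on the slide may differ, e.g., in scaling). Analysis:

\[
d_l = x_{2l+1} - \tfrac{1}{2}\left(x_{2l} + x_{2l+2}\right), \qquad
c_l = x_{2l} + \tfrac{1}{4}\left(d_{l-1} + d_l\right),
\]

and synthesis, obtained by inverting the two steps in reverse order:

\[
x_{2l} = c_l - \tfrac{1}{4}\left(d_{l-1} + d_l\right), \qquad
x_{2l+1} = d_l + \tfrac{1}{2}\left(x_{2l} + x_{2l+2}\right).
\]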
• 157. (figures: DWT in the X and Y directions; coefficient ordering along the X direction; coefficient ordering along the Y direction)
  • 158. Each pixel in the input frame is represented by 16 bits, accounting for 2 pixels per memory word. Thus, each memory read brings in two consecutive pixels of a row.
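Unpacking such a pixel pair is a shift and a mask. A tiny illustrative helper (which half-word holds the first pixel is my assumption, not stated on the slides):

```c
#include <stdint.h>

/* Each 32-bit memory word holds two consecutive 16-bit pixels of a row. */
static void unpack_pixels(uint32_t word, uint16_t *p0, uint16_t *p1)
{
    *p0 = (uint16_t)(word & 0xFFFFu);  /* first pixel: low half-word (assumed)  */
    *p1 = (uint16_t)(word >> 16);      /* second pixel: high half-word (assumed) */
}
```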
• 159. (figures: the 3 stages of wave-letting; high pass and low pass coefficients at stage 1, X direction)
  • 161. Interleaved ordering along the 3 stages of wave-letting
• 162. • Memory addressing is done with a pair of address registers: a read address register and a write address register. (figure: Stage 1 architecture)
• 163. The difference between the write and read registers is the latency of the pipelined data-flow blocks. The maximum and minimum coefficient values for each block (each quadrant in the multi-stage wave-letting) are maintained on the FPGA. These values are written back to a known location in the lower half (lower 0.5 MB) of the embedded memory. The second stage uses these values for the dynamic quantization of the coefficients.
• 164. Stage 2: Dynamic quantization
• The coefficients from different sub-bands are quantized separately. The dynamic range of the coefficients for each sub-band (computed in the first stage) is divided into 16 quantization levels.
• The coefficients are quantized into one of the 16 possible levels.
• The maximum and minimum values of the coefficients for each sub-band are also needed when decoding the image.
(figure: the quantizer realized as a binary search tree lookup in hardware)
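Quantizing into one of 16 uniform levels over the sub-band's dynamic range amounts to a 4-step binary search, which maps directly onto a comparator tree in hardware. A software sketch of the same idea (my illustration, names and rounding details are assumptions):

```c
/* Quantize c into one of 16 uniform levels over [min, max], the dynamic
   range of its sub-band, by a 4-step binary search: one level bit per step. */
static unsigned quantize16(int c, int min, int max)
{
    unsigned level = 0;
    int lo = min, hi = max;
    for (int bit = 3; bit >= 0; bit--) {
        int mid = lo + (hi - lo) / 2;           /* halve the interval   */
        if (c >= mid) { level |= 1u << bit; lo = mid; }
        else          { hi = mid; }
    }
    return level;                               /* 0 .. 15 */
}
```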
• 165. Stage 2: Zero thresholding and RLE on zeroes
Different thresholds are used for different sub-bands, resulting in different resolutions in different sub-bands.
• 167. Stage 2: Entropy encoding
• The encoding is implemented by two look-up tables on the FPGA. Given an eight bit input, the first look-up table (LUT) provides the size of the encoding; the second LUT gives the actual encoding.
• Only the relevant bits from the second LUT should be used.
• The rest of the bits in the output are don't-cares and are chosen as logic 0 or 1 during logic optimization.
(figure: Stage 2 entropy encoder)
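The two-LUT organization separates the code length from the code bits, so one 8-bit symbol indexes both tables in a single cycle. A minimal software analogue (the table contents below are placeholders, not the actual codebook):

```c
#include <stdint.h>

/* len_lut gives the length of each symbol's code (3..18 bits),
   code_lut the right-aligned code bits; contents are placeholders. */
static const uint8_t  len_lut[256]  = { 3, 4, 5 /* ... */ };
static const uint32_t code_lut[256] = { 0x5, 0xA, 0x13 /* ... */ };

static uint32_t encode(uint8_t sym, unsigned *nbits)
{
    *nbits = len_lut[sym];                         /* first LUT: size of code */
    return code_lut[sym] & ((1u << *nbits) - 1u);  /* second LUT: code bits   */
}
```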
• 168. Stage 2: Entropy encoding, bit packing
The output of the entropy encoder varies from 3 to 18 bits. These bits need to be packed into 32 bit words before being written back to the embedded memory. This is achieved by a shifter inspired by the Xtetris computer game and the binary search algorithm. The shifter consists of 5 register stages, each 32 bits wide. The input data can be shifted (rotated) by 16 bits or latched without shifting into stage 1. The data can be shifted by 8 or passed on straight from stage 1 to stage 2. Similarly, data can be shifted by 4, 2, and 1 when moving between the remaining stages. Data is shifted from stage to stage and is accumulated at the last stage. When the last stage has 32 bits of data, a memory write is initiated and the last stage is flushed.
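In software the five shift stages collapse into a single bit accumulator that appends a variable-length code and emits a 32-bit word whenever one is full. A sketch of that behavior (my formulation; the staged hardware shifter achieves the same net effect):

```c
#include <stdint.h>

/* Bit packer: append codes of 3..18 bits, flush full 32-bit words. */
typedef struct {
    uint64_t buf;                 /* bit accumulator                  */
    unsigned fill;                /* number of valid bits in buf      */
    void (*emit)(uint32_t word);  /* memory-write callback            */
} packer;

static void pack(packer *p, uint32_t code, unsigned nbits)
{
    p->buf  = (p->buf << nbits) | (code & ((1u << nbits) - 1u));
    p->fill += nbits;
    while (p->fill >= 32) {       /* a full word accumulated: write it */
        p->emit((uint32_t)(p->buf >> (p->fill - 32)));
        p->fill -= 32;
    }
}
```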
• 170. Stage 2: Output file format
At the end of the second stage, the upper memory (upper 0.5 MB) contains the packed bit stream. The total count of the bit stream, rounded to the nearest word, is written to memory location 0. To reconstruct the data from the bit stream, the following information is needed:
 The actual bit stream. On Huffman decoding, the actual 8 bit codes are retrieved. These codes are either the quantizer output or the RLE count. On expanding the RLE count to the corresponding number of zeroes, we get the actual quantized stream.
 The four quadrants of the final stage of wave-letting can be located at the first four 128×128 byte blocks. The three quadrants of the next stage can be located at the next three blocks, sized 256×256 bytes each. Each quadrant (sub-band) is quantized separately; the dynamic range of each quadrant must be known to reconstruct the original stream.
 The output file contains all the information needed to reconstruct the image.
(figure: Stage 2 output file format)
• 171. (figures: Stage 2 data flow diagram; overall architecture)
• 172. Wavelet coefficients are read from the lower half of the embedded memory; the block (sub-band) minimum and maximum are also read from memory. The packed bit stream output is written to the upper memory, and the bit stream length is written to memory location 0. The control software reads the embedded memory and generates the compressed image file.
Before reading the wavelet coefficients, the maximum and minimum of the coefficients in each sub-band are read from the lower memory. The coefficients are then read and processed sub-band by sub-band, starting with the lowest frequency band. As shown in the state diagram, a memory read is fired in state Read 001. A memory read has a latency of 2 clock cycles; the result is finally available in state Read 100. Memory writes complete in the same cycle. The two intermediate states, Read 010 and Write, can be used to write back the output, if output is available. Each memory read brings in two wavelet coefficients. In the worst case, the two coefficients expand to 18 bits each, so there are two memory write cycles before the next read. Whenever a memory write is performed, the memory address register is incremented. The read address generators read each sub-band from the interleaved memory pattern. The output is written as a continuous stream, starting with the lowest sub-band. Thus the output is effectively in Mallat ordering and can be progressively transmitted and decoded.
  • 173. Stage 2, control flow diagram