MOHIEDDIN MORADI
mohieddinmoradi@gmail.com
DREAM
IDEA
PLAN
IMPLEMENTATION
Mathematical Background
Measure of Information
• Random variable: X
• Self information: If the outcome of a random variable X is ai with
probability p(ai), then the self information is given by I(ai) = −log₂ p(ai).
• Entropy: The entropy H(X) of a random variable X with the given alphabet
AX and the probabilities pX is defined by H(X) = −Σ pX(a) · log₂ pX(a), summed over all a ∈ AX.
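As a minimal sketch, both measures in Python (the probabilities are illustrative):

import math

def self_information(p):
    # Self information of an outcome with probability p, in bits.
    return -math.log2(p)

def entropy(probs):
    # Entropy H(X): expected self information over the alphabet.
    return sum(p * self_information(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))  # 1.5 bits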
Distortion Measures
• Let x, y be signals, each consisting of n values with a possible range of [0,
xmax] (e.g. [0, 255] for 8 bit images). Then the usual measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR):
MSE(x, y) = (1/n) · Σ (xi − yi)²,  PSNR(x, y) = 10 · log₁₀ (xmax² / MSE(x, y)) dB.
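A small Python sketch of both distortion measures (8 bit images, xmax = 255, assumed):

import math

def mse(x, y):
    # Mean squared error between two equal-length signals.
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def psnr(x, y, xmax=255):
    # Peak signal-to-noise ratio in dB for signals in [0, xmax].
    return 10 * math.log10(xmax ** 2 / mse(x, y))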
Downsampling, Upsampling, and Delay
• Downsampling a sequence x by md can be expressed as yn = x(n·md).
• Upsampling a sequence x by mu can be expressed as yn = x(n/mu) if mu divides n, and yn = 0 otherwise.
• The downsampling and upsampling operations for a signal x are denoted by x ↓ md and x ↑ mu, respectively.
• Consider the sequence y defined by yn = x(n−mdly), that is, the signal x
delayed by mdly samples.
• In the z-transform domain this can be expressed as Y(z) = z^(−mdly) · X(z).
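A Python sketch of the three operations on finite sequences (zero padding stands in for the infinite signal):

def downsample(x, md):
    # y[n] = x[n*md]: keep every md-th sample.
    return x[::md]

def upsample(x, mu):
    # y[n*mu] = x[n], zeros in between.
    y = [0] * (len(x) * mu)
    y[::mu] = x
    return y

def delay(x, mdly):
    # y[n] = x[n - mdly]: shift right, zero-pad the front.
    return [0] * mdly + x[:len(x) - mdly]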
Wavelets
• Wavelets (little waves) are functions that are concentrated in time as well as
in frequency around a certain point
• we mostly mean a pair of functions:
 scaling function φ
 wavelet function Ψ
• The self similarity (refinement condition) of the scaling function φ is
tied to a filter h and is defined by
φ(t) = √2 · Σ hk · φ(2t − k)
• which means that φ remains unchanged if you filter it with h, downsample
it by a factor of two, and amplify the values by √2, successively
Refinement condition of the scaling function – In step a the scaling function is
duplicated, translated and scaled in abscissa. In step b the translated and scaled
duplicates are amplified.
Wavelets
• The wavelet function Ψ is built on φ with the help of a filter g:
Ψ(t) = √2 · Σ gk · φ(2t − k)
• φ and Ψ are uniquely determined by the filters h and g
Wavelets
• φ and Ψ are uniquely determined by the filters h and g
• Variants of these functions are defined, which are translated by an integer,
compressed by a power of two, and usually amplified by a power of √2, e.g.
φ(j,l)(t) = 2^(j/2) · φ(2^j · t − l) and accordingly for Ψ.
• j denotes the scale – the bigger j the higher the frequency and the thinner the
wavelet peak
• l denotes the translation – the bigger l the more shift to the right, and the
bigger j the smaller the steps
Wavelets
Discrete Wavelet Transforms
• The goal is to represent signals as linear combinations of
- wavelet functions at several scales
and
- scaling functions of the widest required scale (i.e. j=J)
• The coefficients c(1−J,l) and d(j,l) for −J < j ≤ 0 describe the transformed
signal we want to feed into a compression routine.
• J corresponds to the number of different scales we can represent, which is
equal to the number of transform levels.
• The bigger J, the more coarse structures can be described.
J=4
j=-3, c-3,l , d-3,l
j=-2, c-2,l , d-2,l
j=-1, c-1,l , d-1,l
j=0, c0,l , d0,l
A basis consisting of scaling and wavelet functions of the CDF(2,2) wavelet – This basis covers
three levels of wavelet functions. Only a finite clip of translates is displayed
Discrete Wavelet Transforms
 For even l, c(j,l) depends only on hk and gk with even k.
 For odd l, c(j,l) depends only on hk and gk with odd k.
• This is the reason why we will split both g and h into their even and odd
indexed coefficients for most of our investigations.
• It is easy to see that the conversion from wavelet coefficients to signal
values is possible without knowing φ or Ψ; the only information needed is
the filters which belong to them.
• Under certain conditions, the same is true for the reverse conversion.
• It allows us to limit our view to the filters g and h and hide the functions φ and Ψ.
• Thus the computation of this change in representation can be made with the
use of filters.
Discrete Wavelet Transforms
• wavelet analysis or wavelet decomposition : conversion from the original
signal to the wavelet coefficients
• wavelet synthesis or wavelet reconstruction : conversion from the wavelet
coefficients back to the signal or an approximated version of it.
• the analysis scaling and wavelet functions: φ̃ and Ψ̃
• the synthesis scaling and wavelet functions: φ and Ψ
• The corresponding filters are denoted accordingly by h̃ and g̃, and h and g.
Discrete Wavelet Transforms
One level of wavelet transform expressed using filters: the signal x is
convolved with a low pass and a high pass filter, and downsampling by two
discards the even and odd indices after filtering. The low pass branch
yields c as a coarse version of the signal x at half resolution; the high
pass branch yields d, the differences or details that are necessary to
reconstruct the original signal x from the coarse version.
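A sketch of one analysis level in Python, assuming the unnormalized CDF(2,2) (LeGall 5/3) analysis taps; boundary handling and phase alignment are simplified here:

import numpy as np

H = np.array([-1, 2, 6, 2, -1]) / 8.0  # low pass analysis filter
G = np.array([-1, 2, -1]) / 2.0        # high pass analysis filter

def analysis_level(x):
    # Filter, then discard every second coefficient.
    c = np.convolve(x, H, mode="same")[0::2]  # coarse version at half resolution
    d = np.convolve(x, G, mode="same")[1::2]  # details needed for reconstruction
    return c, d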
Discrete Wavelet Transforms
three levels of wavelet transform
Level 2 j=-2, c-2,l , d-2,l
Level 1 j=-1, c-1,l , d-1,l
Level 0 j=0, c0,l , d0,l
Discrete Wavelet Transforms
Cohen-Daubechies-Feauveau CDF(2,2) Wavelet
(biorthogonal (5,3) wavelet)
• Filter length of 5 and 3 for the low and high pass filters.
• The filters, as well as the scaling and wavelet functions for decomposition
and reconstructions, are symmetric.
• A symmetric filter f always has odd filter length and it holds that
f(a+k) = f(b−k)
where a and b are the smallest and greatest index l, respectively, for which fl is
different from zero.
Symmetry is a very important property if we consider image compression,
because in the absence of symmetry artifacts are introduced around edges.
Cohen-Daubechies-Feauveau CDF(2,2) Wavelet
(biorthogonal (5,3) wavelet)
Cohen-Daubechies-Feauveau CDF(2,2) Wavelet
(biorthogonal (5,3) wavelet)
The relation between the regularity of the synthesis wavelet and the
number of vanishing moments of the analysis wavelet
A biorthogonal wavelet has m vanishing moments if and only if its dual scaling
function generates polynomials up to degree m.
In other words, vanishing moments tend to reduce the number of significant wavelet
coefficients and thus, one should select a wavelet with many of them for the
analysis
On the other hand, regular or smooth synthesis wavelets give good approximations if
not all coefficients are used for reconstruction, as is the case for lossy
compression.
To increase the number of vanishing moments of the decomposition wavelet one has to
enlarge the filter length of the corresponding analysis low and high pass filters. That
is, there is a trade-off between filter length and the number of vanishing moments of
the decomposition wavelet.
In terms of image compression, you can improve the compression performance at the
expense of increased computational power to calculate the filter operations.
Lifting Scheme
• An alternative computation method of the discrete wavelet transform.
• In order to be consistent we base our introduction to Lifting on the
CDF(2,2) wavelet, which also serves as the running example.
The Lifting Scheme is composed of three steps, namely:
1- Split (also called Lazy Wavelet Transform)
(to split the input signal x into even and odd indexed samples)
2- Predict
(to predict the odd samples based on the evens)
3- Update
(to ensure that the average of the signal is preserved)
(figure: the input signal is split into even and odd samples. The odd samples
are replaced by the old ones minus the prediction, yielding the detail
coefficients; the even samples are updated to yield the approximation, the
coarser version of the input sequence at half the resolution, which ensures
that the average of the signal is preserved.)
Lifting Scheme
With z-transform notation we could express the split and merge step using
downsampling, delay, and upsampling, respectively.
Lifting Scheme
The predict and update steps for the CDF(2,2) wavelet
• Here the predictor is chosen to be linear.
• If the input signal itself is linear, all detail coefficients will be zero.
• Therefore we have
dl = x(2l+1) − (x(2l) + x(2l+2))/2 for the prediction step, and
cl = x(2l) + (d(l−1) + dl)/4 for the update step.
What are the advantages of this method ?
1. The most important fact is that we do not throw away already computed coefficients as in the
filter bank approach.
2. It is also remarkable that the wavelet transform can now be computed in place. This means
that, given a finite length signal with n samples, we need exactly n memory cells, each
capable of storing one sample, to compute the transform.
3. Furthermore we reduce the number of operations in order to compute the coefficients of the
next coarser or finer scale, respectively. For the CDF(2,2)-wavelet we save three operations
using the Lifting Scheme in comparison with the traditional filter bank approach.
Integer-to-Integer Mapping
• Obviously, the application of the filter bank approach or the Lifting Scheme leads to
coefficients which are not integers in general. In the field of hardware image compression it
would be convenient that the coefficients and the pixels of the reconstructed image are integers
too.
• For the special case of the CDF(2,2) wavelet we therefore use the prediction and update steps
as follows:
dl = x(2l+1) − ⌊(x(2l) + x(2l+2)) / 2⌋
cl = x(2l) + ⌊(d(l−1) + dl + 2) / 4⌋
• As a consequence the coefficients of all scales −J < j ≤ 0 can be stored as integers, and
integer arithmetic is sufficient for all operations. Note that the coarser the scale, the more
bits are necessary to store the corresponding coefficients. To overcome the growing bitwidth at
coarser scales, modular arithmetic can be used in the case of lossless compression.
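A minimal Python sketch of this integer-to-integer lifting step and its exact inverse (boundary handling by repeating the edge sample is an assumption for illustration):

def fwd53(x):
    # One level of integer CDF(2,2) lifting; // is the floor operation.
    even, odd = x[0::2], x[1::2]
    d = [odd[i] - (even[i] + even[min(i + 1, len(even) - 1)]) // 2
         for i in range(len(odd))]
    c = [even[i] + (d[max(i - 1, 0)] + d[min(i, len(d) - 1)] + 2) // 4
         for i in range(len(even))]
    return c, d

def inv53(c, d):
    # Exact inverse: integer arithmetic makes the mapping reversible.
    even = [c[i] - (d[max(i - 1, 0)] + d[min(i, len(d) - 1)] + 2) // 4
            for i in range(len(c))]
    odd = [d[i] + (even[i] + even[min(i + 1, len(even) - 1)]) // 2
           for i in range(len(d))]
    x = [0] * (len(even) + len(odd))
    x[0::2], x[1::2] = even, odd
    return x

x = [3, 7, 2, 8, 1, 9, 4, 6]
c, d = fwd53(x)
assert inv53(c, d) == x  # perfect reconstruction with integers only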
Lifting Scheme for the CDF(2,2) wavelet after Integer-to-Integer Mapping
Wavelet transforms on images
Wavelet transforms on images
• To transform images we can use two dimensional wavelets or apply the one
dimensional transform to the rows and columns of the image successively,
as a separable two dimensional transform.
• The image is interpreted as a two dimensional array I; its entries are the
image pixels, and after the transform the array holds the wavelet coefficients.
Wavelet transforms on images
Reflection at Image Boundary
A row of an image r = (r0, . . . , rN−1)
In order to convolve such a row r with a filter f we have to extend it to infinity
in both directions.
Extended row r′
Reflection at Image Boundary
There are several choices for the values of r′k outside the interval
[0, N−1].
The most popular ones are:
• padding with zeros,
• periodic extension, or
• symmetric extension.
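A sketch of the three extensions in Python (whole-sample symmetric reflection is shown; which variant is used is a choice of the codec):

def extend(r, k, mode="symmetric"):
    # Value of the extended row r' at index k (k may lie outside [0, N-1]).
    n = len(r)
    if 0 <= k < n:
        return r[k]
    if mode == "zero":
        return 0
    if mode == "periodic":
        return r[k % n]
    # symmetric: ..., r2, r1, r0, r1, ..., r_{N-1}, r_{N-2}, ...
    k = abs(k) % (2 * n - 2)
    return r[k] if k < n else r[2 * n - 2 - k]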
Reflection at Image Boundary
2D-DWT
After one level of transform we obtain N/2 coefficients c(0,l) and N/2 coefficients
d(0,k).
These are given in interleaved order, that is (c(0,0), d(0,0), c(0,1), d(0,1), . . .),
because of the split into odd and even indexed positions in the Lifting Scheme.
Usually this is rearranged so that all low pass coefficients come first,
followed by all high pass coefficients.
• Since we have restricted the images to be of square size N × N,
we can perform at most l = log₂ N levels of transform.
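A sketch of the separable scheme in Python, assuming a 1D transform dwt1d that returns its input rearranged with the low pass coefficients first:

import numpy as np

def dwt2d(img, levels, dwt1d):
    # Apply the 1D DWT to rows, then to columns, and recurse on the LL quadrant.
    out = img.astype(float).copy()
    n = out.shape[0]  # square image of size N x N
    for _ in range(levels):
        for i in range(n):
            out[i, :n] = dwt1d(out[i, :n])  # rows
        for j in range(n):
            out[:n, j] = dwt1d(out[:n, j])  # columns
        n //= 2  # continue on the low-low quadrant
    return out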
2D-DWT
2D-DWT
• In order to preserve the average of a one dimensional signal, or the average
of the brightness of images, we have to consider the normalization factors
after the wavelet transform has taken place
Normalization factors of the CDF(2,2) wavelet in two dimensions for each level l, 0 ≤ l < 5
2D-DWT
Lena horizontal low-pass
Lena low + high-pass subsampled
1-level 2-D wavelet decomposition
2-level 2-D wavelet decomposition
3-level 2-D wavelet decomposition
LEVEL 3
LEVEL 2
LEVEL 1
LEVEL 0
Diagrammatic representation of
the dyadic decomposition for three
decomposition levels
The in-place mapping scheme.
The dyadic decomposition is
applied to a hypothetical 8×8
original image.
State of the art Image Compression
Techniques
• In 1993, Shapiro presented an efficient method to compress wavelet
transformed images: the embedded zerotree wavelet (EZW) encoder.
• This EZW encoder exploits the properties of the multi-scale representation.
• A significant improvement of this central idea was introduced by Said and
Pearlman in 1996: Set Partitioning In Hierarchical Trees (SPIHT,
pronounced 'spite').
Shapiro’s Algorithm (EZW)
A Multi-resolution Analysis Example
Lower octave has higher
resolution and contains higher
frequency information
Shapiro’s Algorithm (EZW)
Parent-Child Relationship of the LL Subband
Shapiro’s Algorithm (EZW)
Tree Structure of Wavelet Coefficients
Parent:
– Coefficient at the coarse scale is
called parent
Children:
– All coefficients corresponding to
the same spatial location at the
next finer scale of similar
orientation
Descendants:
– For a given parent, the set of all
coefficients at all finer scale of
similar orientation, corresponding
to the same location
• Every coefficient at a given scale can be related to a set of coefficients at
the next finer scale of similar orientation.
Shapiro’s Algorithm (EZW)
Parent
Children
Hierarchical trees in multi-level decomposition:
coefficients at the same spatial location form a quad-tree.
Shapiro’s Algorithm (EZW)
• E – The EZW encoder is based on progressive encoding.
Progressive encoding is also known as embedded encoding
• Z – A data structure called zero-tree is used in EZW algorithm
to encode the data
• W – The EZW encoder is specially designed for use with the
wavelet transform. It was originally designed to operate on
images (2-D signals)
Shapiro’s Algorithm (EZW)
• A kind of bitplane coding.
• The kth bits of the coefficients constitute a bitplane.
• A bitplane encoder starts coding with the most significant bit of each
coefficient.
• Within a bitplane the bits of the coefficients with largest magnitude come
first.
The coefficients are shown in decreasing order from left to right.
Each coefficient is represented with eight bits, where the least significant bit is in front.
(figure: bitplane layout for N coefficients, sign bit and bits from MSB down to LSB;
the coding order proceeds bitplane by bitplane, starting at the MSB.)
Shapiro’s Algorithm (EZW)
• self similarities between different scales which result from the recursive
application of the wavelet transform step to the low frequency band
Shapiro’s Algorithm (EZW)
• Shapiro proposes to scan the samples from left to right and from top to
bottom within each sub band, starting in the upper left corner.
• The sub bands LH, HL, and HH at each scale are scanned in that order.
• Furthermore, in contrast to traditional bitplane coders he has introduced
data dependent examination of the coefficients.
• The idea behind this is: if there are large areas with unimportant samples in
terms of compression, they should be excluded from exploration. The
addressed self similarities are the key to performing such exclusions of large
areas.
Shapiro’s Algorithm (EZW)
• In order to exploit the self similarities
during the coding process, oriented
trees of outdegree four are taken for the
representation of a wavelet transformed
image.
• Each node of the trees represents a
coefficient of the transformed image.
• The levels of the trees consist of
coefficients at the same scale.
• The trees are rooted at the lowest
frequency subbands of the
representation.
Each coefficient in the LH, HL, and HH subbands of each scale has four children
The coefficients at the highest frequency subbands have no children
There is only one coefficient in the lowest frequency band (DC coefficient) that has three children
Oriented quad tree’s, four transform level, N = 16
Shapiro’s Algorithm (EZW)
A coefficient of the wavelet transformed image is insignificant with respect to a threshold th if its
magnitude |c| is smaller than 2^th.
Otherwise it is called significant with respect to the threshold th.
 Dominant pass
In the dominant pass, the coefficients are scanned in raster order (from left to right and from top
to bottom) within the quadrants. The scan starts with the quadrants of the highest transform
level.
• In each transform level the quadrants are scanned in the order HL, LH, and HH. The
coefficients are coded by symbol P, N, ZTR, or IZ.
A coefficient is coded by:
P, if its absolute value is greater than the given threshold and it is positive.
N, if its absolute value is greater than the given threshold and it is negative.
ZTR, if its absolute value is smaller than the given threshold and the absolute values of
all coefficients in the corresponding quad tree are smaller than the threshold, too (zero
tree root).
IZ, if its absolute value is smaller than the given threshold and there exists at least one
coefficient in the corresponding quad tree whose absolute value is greater than the given
threshold (isolated zero).
A compact sketch of this decision rule is given below.
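A compact Python sketch of the decision, assuming the maximum descendant magnitude of each coefficient has been precomputed:

def classify(c, tree_max, th):
    # EZW dominant-pass symbol for coefficient c; tree_max is the largest
    # magnitude in its quad tree of descendants.
    if abs(c) >= th:
        return "P" if c > 0 else "N"
    return "ZTR" if tree_max < th else "IZ"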
Shapiro’s Algorithm (EZW)
Z:
 It is used within the high frequency bands of level one only, because the
coefficients in these quadrants cannot be the root of a zerotree.
 It can thus be seen as the combination of ZTR and IZ for this special case.
Once a coefficient is encoded as the symbol P or N it is not included in the
determination of zerotrees.
 Subordinate pass:
• Each coefficient that has been coded as P or N in a previous dominant pass
is now refined by coding the th-bit of its binary representation.
• This corresponds to a bitplane coding, where the coefficients are refined in a data
dependent manner.
• The most important fact hereby is that no indices of the coefficients under
consideration have to be coded. This is done implicitly, due to the order in
which they become significant (coded as P or N in the dominant pass).
Shapiro’s Algorithm (EZW)
Example: encoding the wavelet
transformed image given in Figure using
the embedded zerotree wavelet algorithm
of Shapiro
the subband HL is excluded and the subband LH is scanned only partially.
Shapiro’s Algorithm (EZW)
The EZW algorithm is based on two observations:
– Natural images in general have a low pass
spectrum. When an image is wavelet
transformed, the energy in the sub-bands
decreases as the scale decreases (low scale
means high resolution), so the wavelet
coefficients will, on average, be smaller in the
lower levels than in the higher levels.
– Large wavelet coefficients are more important
than small wavelet coefficients.
631 544 86 10 -7 29 55 -54
730 655 -13 30 -12 44 41 32
19 23 37 17 -4 -13 -13 39
25 -49 32 -4 9 -23 -17 -35
32 -10 56 -22 -7 -25 40 -10
6 34 -44 4 13 -12 21 24
-12 -2 -8 -24 -42 9 -21 45
13 -3 -16 -15 31 -11 -10 -17
typical wavelet coefficients
for an 8×8 block in a real image
EZW – basic concepts
The observations give rise to the basic progressive coding idea:
1. We can set a threshold T; if the wavelet coefficient is larger than T, we
encode it as 1, otherwise as 0.
2. '1' will be reconstructed as T (or a number larger than T) and '0' will be
reconstructed as 0.
3. We then decrease T to a lower value and repeat steps 1 and 2, obtaining
finer and finer reconstructed data.
The actual implementation of the EZW algorithm has to consider:
1. How do we handle the sign of the coefficients (positive or negative)?
– answer: use POS (P) and NEG (N)
2. Can we code the '0's more efficiently? – answer: zero-tree
3. How do we decide the threshold T and how do we reconstruct? – answer: see the
algorithm
EZW – basic concepts
coefficients at the same spatial
location form a quad-tree.
• The definition of the zero-tree:
There are coefficients in different subbands that represent the same spatial
location in the image, and this spatial relation can be depicted by a quad tree,
except for the root node at the top left corner representing the DC coefficient,
which only has three children nodes.
• Zero-tree Hypothesis
If a wavelet coefficient c at a coarse scale is insignificant with respect to a
given threshold T, i.e. |c|<T then all wavelet coefficients of the same
orientation at finer scales are also likely to be insignificant with respect to T.
EZW – basic concepts
First step:
The DWT of the entire 2-D image will be computed by FWT
Second step:
Progressively EZW encodes the coefficients by decreasing the threshold
Third step:
Arithmetic coding is used to entropy code the symbols
EZW – basic concepts
Here MAX() means the maximum coefficient value in the image and y(x,y) denotes the
coefficient. With this threshold we enter the main coding loop
threshold = initial_threshold;
do {
dominant_pass(image);
subordinate_pass(image);
threshold = threshold/2;
} while (threshold > minimum_threshold);
The main loop ends when the threshold reaches a minimum value, which can be
specified to control the encoding performance; a minimum value of “0” gives
lossless reconstruction of the image.
The initial threshold t0 is decided as t0 = 2^⌊log₂(MAX(|y(x,y)|))⌋.
Second step:
Progressively, EZW encodes the coefficients by decreasing the threshold
(a skeleton of this loop is sketched below).
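A Python skeleton of the loop; initial_threshold, dominant_pass, and subordinate_pass are hypothetical helpers standing in for the passes described here:

def ezw_encode(image, minimum_threshold=1):
    threshold = initial_threshold(image)  # 2**floor(log2(max |coefficient|))
    while threshold > minimum_threshold:
        dominant_pass(image, threshold)
        subordinate_pass(image, threshold)
        threshold //= 2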
EZW – basic concepts
In the dominant_pass
• All the coefficients are scanned in a special order.
• If the coefficient is a zero tree root, it will be encoded as ZTR. None of its
descendants need to be encoded – they will be reconstructed as
zero at this threshold level.
• If the coefficient itself is insignificant but one of its descendants is
significant, it is encoded as IZ (isolated zero).
• If the coefficient is significant then it is encoded as POS (P) or NEG
(N) depending on its sign.
This encoding of the zero tree produces significant compression because gray level images
resulting from natural sources typically result in DWTs with many ZTR symbols. Each
ZTR indicates that no more bits are needed for encoding the descendants of the
corresponding coefficient
EZW – basic concepts
At the end of dominant_pass
• all the coefficients that are in absolute value larger than the current
threshold are extracted and placed without their sign on the subordinate
list and their positions in the image are filled with zeroes. This will
prevent them from being coded again.
In the subordinate_pass
• All the values in the subordinate list are refined. This gives rise to some
juggling with uncertainty intervals, and it outputs the next most significant
bit of all the coefficients in the subordinate list.
EZW – basic concepts
Wavelet coefficients for an 8×8 block
EZW – an example
The initial threshold is 32 and the result from the
dominant_pass is shown in the figure
Data without any symbol is a node in the zero-tree.
63(POS) -34(NEG)  49(POS)  10(ZTR)   7(IZ)  13(IZ) -12   7
-31(IZ)   23(ZTR)  14(ZTR) -13(ZTR)   3(IZ)   4(IZ)   6  -1
 15(ZTR)  14(IZ)    3      -12        5      -7       3   9
 -9(ZTR)  -7(ZTR) -14        8        4      -2       3   2
 -5        9       -1(IZ)   47(POS)   4       6      -2   2
  3        0       -3(IZ)    2(IZ)    3      -2       0   4
  2       -3        6       -4        3       6       3   6
  5       11        5        6        0       3      -4   4
Significance Map
EZW – an example
The result from the dominant_pass is output as the following:
D1: POS, NEG, IZ, ZTR, POS, ZTR, ZTR, ZTR, ZTR, IZ, ZTR, ZTR, IZ, IZ, IZ, IZ, IZ, POS, IZ, IZ
The significant coefficients are put in a subordinate list and are
refined. A one-bit symbol is output to the decoder.
Original data 63 34 49 47
Output symbol (S1) 1 0 1 0
Reconstructed data 56 40 56 40
For example, the output for 63 is:
sign 32 16 8 4 2 1
0     1  1 ? ? ? ?
After the dominant pass with T = 32, a significant coefficient is known to lie in
[32, 64). The subordinate pass compares it with the midpoint T + T/2 = 48: if the
data item is at least 48, a 1 is put in the code and it is reconstructed as the
centre of [48, 64), i.e. (48 + 64)/2 = 56; if it is below 48, a 0 is put in the
code and it is reconstructed as the centre of [32, 48), i.e. 40.
So 63 will be reconstructed as 56, and 34 will be reconstructed as 40.
EZW – an example
* * * 10 7 13 -12 7
-31 23 14 -13 3 4 6 -1
15 14 3 -12 5 -7 3 9
-9 -7 -14 8 4 -2 3 2
-5 9 -1 * 4 6 -2 2
3 0 -3 2 3 -2 0 4
2 -3 6 -4 3 6 3 6
5 11 5 6 0 3 -4 4
After the dominant pass, the significant coefficients are replaced by * and treated as zero.
Then the threshold is divided by 2, so we have 16 as the current threshold.
Significance Map
EZW – an example
The result from the second dominant pass is output as the following:
D2: IZ, ZTR, NEG, POS, IZ,IZ, IZ, IZ, IZ, IZ, IZ, IZ
The significant coefficients are put in the subordinate list
and all data in this list will be refined as:
Original data 63 34 49 47 31 23
Output symbol 1 0 0 1 1 0
Reconstructed data 60 36 52 44 28 20
For example, the output for 63 is:
sign 32 16 8 4 2 1
0     1  1 1 ? ? ?
The computation is now extended with respect to the next significant bit, so 63 will be
reconstructed as the average of 56 and 64, i.e. 60.
EZW – an example
The process is going on until threshold =1, the final output as:
p=pos, n=neg, z=iz, t=ztr, Di = output of the i'th dominant pass, Si = output of the i'th subordinate pass
D1: pnztpttttztttttttptt
S1: 1010
D2: ztnptttttttt
S2: 100110
D3: zzzzzppnppnttnnptpttnttttttttptttptttttttttptttttttttttt
S3: 10011101111011011000
D4: zzzzzzztztznzzzzpttptpptpnptntttttptpnpppptttttptptttpnp
S4: 11011111011001000001110110100010010101100
D5: zzzzztzzzzztpzzzttpttttnptppttptttnppnttttpnnpttpttppttt
S5: 10111100110100010111110101101100100000000110110110011000111
D6: zzzttztttztttttnnttt
For example, the output for 63 is:
sign 32 16 8 4 2 1
0     1  1 1 1 1 1
So 63 will be reconstructed as 32+16+8+4+2+1 = 63!
Note, how progressive transmission can be done.
EZW – an example
(figures: the progressively reconstructed 8×8 coefficient blocks after the successive passes.)
SPIHT- Set Partitioning In Hierarchical Trees
Said and Pearlman have significantly improved the codec of Shapiro.
The main idea is based on the partitioning of sets, which consist of coefficients or
representatives of whole subtrees.
They classify the coefficients of a wavelet transformed image in three sets:
 LIP: list of insignificant pixels which contains the coordinates of those
coefficients which are insignificant with respect to the current threshold th.
 LSP: list of significant pixels which contains the coordinates of those coefficients
which are significant with respect to th.
 LIS: list of insignificant sets which contains the coordinates of the roots of
insignificant subtrees.
During the compression procedure, the sets of coefficients in LIS are refined and if
coefficients become significant they are moved from LIP to LSP.
SPIHT- Set Partitioning In Hierarchical Trees
• The first difference from Shapiro's EZW algorithm is the distinct definition of
significance: here, the root of the tree is excluded from the computation
of the significance attribute.
Set O(i, j), D(i, j) and L(i, j):
offspring
SPIHT- Set Partitioning In Hierarchical Trees
Ex: the sets O(i, j), D(i, j) and L(i, j) with i = 1 and j = 1; the labels O, D, L
show that the coefficient is a member of the corresponding set (N = 8)
SPIHT- Set Partitioning In Hierarchical Trees
(figure: the set H of tree roots and the two types of LIS entries, type A and type B.)
SPIHT- Set Partitioning In Hierarchical Trees
In the SPIHT algorithm the significance is computed for the sets D(i, j) and L(i, j).
The root of each quadtree is, in contrast to the algorithm presented by Shapiro, not included in
the computation of the significance.
Significance Attribute
SPIHT- Set Partitioning In Hierarchical Trees
Parent-Child Relationship of the LL Subband
different parent-child relationships of the LL band
SPIHT- Set Partitioning In Hierarchical Trees
SPIHT- Set Partitioning In Hierarchical Trees
O(i,j): set of coordinates of all offspring of node (i,j); children only
D (i,j): set of coordinates of all descendants of node (i,j); children, grandchildren, great-grand, etc.
L (i,j): D(i,j) – O(i,j) (all descendants except the offspring); grandchildren, great-grand, etc.
SPIHT Sorting Pass
SPIHT- Set Partitioning In Hierarchical Trees
O(i,j): set of coordinates of all offspring of node (i,j); children only
D (i,j): set of coordinates of all descendants of node (i,j); children, grandchildren, great-grand, etc.
H (i,j): set of all tree roots (nodes in the highest pyramid level); parents
L (i,j): D(i,j) – O(i,j) (all descendants except the offspring); grandchildren, great-grand, etc. (a sketch of these sets follows below)
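A sketch of these sets as coordinate enumerations in Python (the special parent-child rule of the LL band roots is omitted for brevity):

def offspring(i, j, n):
    # O(i, j): the four children of (i, j) in an n-by-n transform.
    if 2 * i >= n or 2 * j >= n:
        return []  # highest frequency subbands have no children
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]

def descendants(i, j, n):
    # D(i, j): all descendants; L(i, j) = D(i, j) minus O(i, j).
    out, frontier = [], offspring(i, j, n)
    while frontier:
        out.extend(frontier)
        frontier = [c for node in frontier
                    for c in offspring(node[0], node[1], n)]
    return out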
SPIHT Refinement Pass
SPIHT- Set Partitioning In Hierarchical Trees
The refinement phase outputs the kth bit
of each element of the list LSP, if it was
not included in the last sorting phase.
Example of SPIHT
(figure: initialization — LIP receives all roots, LIS the D sets of the roots.)
SPIHT- Set Partitioning In Hierarchical Trees
(figures: example list states for n = 4, n = 3, and n = 2.)
SPIHT- Set Partitioning In Hierarchical Trees
63 -34  49  10   7  13 -12   7
-31  23  14 -13   3   4   6  -1
 15  14   3 -12   5  -7   3   9
 -9  -7 -14   8   4  -2   3   2
 -5   9  -1  47   4   6  -2   2
  3   0  -3   2   3  -2   0   4
  2  -3   6  -4   3   6   3   6
  5  11   5   6   0   3  -4   4
SPIHT- Set Partitioning In Hierarchical Trees
Example of 3-scale wavelet transform of an 8 by 8 image.
SPIHT- Set Partitioning In Hierarchical Trees
Example of 3-scale wavelet transform of an 8 by 8 image.
0) Initialization
LSP←{}
LIP ←{(63,-34,-31,23)}
LIS ←{(-34,-31,23)}
1) n=5 (T=2^5=32)
LSP←{(63,-34,49,47)}
LIP ←{(-31,23),(10,14,-13),
(15,14,-9,-7),(-1,-3,2)}
LIS ←{(23),-34B,(15,-9,-7)}
output:5,1+1-0011+000010000100101+0000
decoding
(figure: the decoded 8×8 image after this pass; the four significant coefficients appear as ±32, all other coefficients are 0.)
SPIHT- Set Partitioning In Hierarchical Trees
2) n=4 (T=2^4=16)
LSP←{(63,-34,49,47),(-31,23)}
LIP ←{(10,14,-13),(15,14,-9,-7),(-1,-3,2)}
LIS ←{(23),-34B,(15,-9,-7)}
output:
sorting pass:
4,1-1+000000000000000
refinement pass: 1010
decoding
(figure: the decoded 8×8 image after this pass; previously significant coefficients are refined and newly significant ones appear at ±16, all others are 0.)
SPIHT- Set Partitioning In Hierarchical Trees
3) n=3 (T=2^3=8)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9)}
LIP ←{(-7),(-1,-3,2),(3),(-5,3,0),(2,-3,5),(7,3,4),(7,6,-1),(3,3,2)}
LIS ←{(-7),23B,(14)}
output:
sorting pass:
3,1+1+1-1+1+1-0000101-1-1+01
101+0010001+0101+0011-0000
101+00
refinement pass: 100110
decoding
(figure: the decoded 8×8 image after this pass; the coefficients are reconstructed to multiples of 8.)
SPIHT- Set Partitioning In Hierarchical Trees
4) n=2 (T=2^2=4)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9),(-7,-5,5,7,4,7,6,6,-4,5,6,5,-7,4,4,6,4,6,6,-4,4)}
LIP ←{(-1,-3,2),(3),(3,0),(2,-3),(3),(-1),(3,3,2),(-2),(3,-2),(-2,2,0),
(3,0,3),(3)}
LIS ←{}
output:
sorting pass:
2,1-00001-00001+1+01+1+1+0000
11+1-1+1+111+1-1+011+1+00100
01+101+00101+1-1+
refinement pass:
10011101111011000110
decoding
(figure: the decoded 8×8 image after this pass; the coefficients are reconstructed to multiples of 4.)
SPIHT- Set Partitioning In Hierarchical Trees
5) n=1 (T=2^1=2)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9),(-7,-5,5,7,4,7,6,6,-4,5,6,5,-7,4,4,6,4,6,6,-4,4),
(-3,2,3,3,2,-3,3,3,3,2,-2,3,-2,-2,2,3,3,3)}
LIP ←{(-1),(0),(-1),(0),(0)}
LIS ←{}
output:
sorting pass:
1,01-1+1+1+01+1-1+01+1+1+1-1+
1-1-1+01+01+1+
refinement pass:
1101111101100100100010010
1110010100101100
decoding
(figure: the decoded 8×8 image after this pass; the coefficients are reconstructed to multiples of 2.)
SPIHT- Set Partitioning In Hierarchical Trees
6) n=0 (T=2^0=1)
LSP←{(63,-34,49,47),(-31,23),(10,14,-13,15,14,-9,-12,-14,8,9,11,
13,-12,9),(-7,-5,5,7,4,7,6,6,-4,5,6,5,-7,4,4,6,4,6,6,-4,4),
(-3,2,3,3,2,-3,3,3,3,2,-2,3,-2,-2,2,3,3,3),(-1,-1)}
LIP ←{(0),(0),(0)}
LIS ←{}
output:
sorting pass:
0,1-01-00
refinement pass:
1011110011010001110111110
1000101100000000101101111
001000111
decoding
(figure: the decoded 8×8 image now equals the original coefficients, i.e. lossless reconstruction.)
SPIHT- Set Partitioning In Hierarchical Trees
The basic Algorithm
LIP: List of Insignificant Pixel
LIS: List of Insignificant Set
LSP: List of Significant Pixel
SPIHT- Set Partitioning In Hierarchical Trees
(figure: initialization — LIP receives all roots, LIS the D sets of the roots.)
The basic Algorithm
SPIHT- Set Partitioning In Hierarchical Trees
Partitioned Approach
What are the advantages and disadvantages of this partitioned approach?
+ The internal memory requirements are dramatically reduced.
+ The subimages can be transformed independently of each other.
+ The traditional 2D-DWT can be directly applied to the subimages without any modifications.
− It is not applicable for lossy compression.
Boundary Treatment
Which pixels have to be considered in order to compute the low and high pass coefficients
using the CDF(2,2) wavelet?
A partitioned wavelet transform without boundary treatment introduces block
artifacts when targeting lossy compression.
(figures: the computation of the low pass coefficients and their dependences on the coefficients
of the previous scales; the area of coefficients which have to be incorporated into the
computation of the transform of the partition under consideration.)
Modifications to the SPIHT Codec
In this section we present the modifications necessary to obtain an efficient
hardware implementation of the SPIHT compressor based on the
partitioned approach to wavelet transform images.
 At first, we exchange the sorting phase with the refinement phase to save
memory for status information.
 The greatest challenge is the hardware implementation of the three lists
LIP, LSP, and LIS.
Exchange of Sorting and Refinement Phase
In the basic SPIHT algorithm, status information has to be stored for the
elements of LSP, specifying whether the corresponding coefficient has been
added to LSP in the current iteration of the sorting phase (see Line 20).
In the worst case all coefficients become significant in the same iteration.
Consequently, we would have to provide a memory capacity of q² bits to store
this information.
However, if we exchange the sorting and the refinement phase, we do not need
to consider this information anymore.
The compressed data stream is still decodable and it is not increased in size.
Of course, we have to consider that there is a reordering of the transmitted bits
during an iteration.
Memory Requirements of the Ordered Lists
To obtain an efficient realization of the lists LIP, LSP, and LIS, we first have to specify
the operations that take place on these lists and deduce worst case space
requirements in a software implementation.
Estimations for LIS
We have to provide the following operations for LIS:
• initialize as empty list (Line 0),
• append an element (Line 0 and 19),
• sequentially iterate the elements (Line 5),
• delete the element under consideration (Line 15 and 19),
• move the element under consideration to the end of the list and change the type (Line
14).
LIS contains at most (q/2)² elements.
To specify the coordinates of an element, 2 log₂(q/2) bits are needed.
A further bit to code the type information has to be added per element. This
results in an overall space requirement for LIS of (q/2)² · (2 log₂(q/2) + 1) bits.
Estimations for LIP and LSP
Now, let us consider both the list LIP and the list LSP because they can be
implemented together. Again,
we start with the operations applied to both lists:
• initialize as empty list (Line 0),
• append an element (Line 0, 4, 11, and 12),
• sequentially iterate the elements (Line 20),
• delete the element under consideration from LIP (Line 4).
The overall space requirement for both lists is q² · 2 log₂ q bits.
FPGA architectures
Prototyping Environment
The prototyping environment provides several
mechanisms to exchange data between the
mounted Xilinx device, the local SRAM, and the
PC main memory.
These could be categorized into:
1. direct access to registers/latches (configured
into the FPGA)
2. access to memory cells of the local SRAM
(all data transfers are going through the FPGA)
3. DMA transfers and DMA on demand
transfers
4. interrupts.
To communicate with the PCI card one has to write a simple C/C++ program which initializes the card, configures
the FPGA, sets the clock rate, and starts the data transfer or the computation.
• The Xilinx XC4085 device
consists of a matrix of
Configurable Logic Blocks
(CLB) with 56 rows and 56
columns.
• These CLBs are SRAM based
and can be configured many
times.
• After the power supply is shut
down, they have to be
reconfigured.
• It can be configured to represent
any function with 4 or 5 inputs
(function generator F, G, or H).
• There are mainly two flip-flops,
two tristate drivers, and the
so-called carry logic available.
The Xilinx XC4085 XLA device
The routing is done using programmable switch matrices (PSM)
(a) interconnections between the CLBs
(b) switch matrices in XC4085XLA
• Each CLB contains hard-wired carry logic to accelerate arithmetic operations.
• Fast adders can be constructed as simple ripple carry adders, using the special carry logic to
calculate and propagate the carry between full adders in adjacent CLBs. One CLB is capable
of realizing two full adders.
The Xilinx XC4085 XLA device
• Each CLB can be configured to
provide internal RAM.
• In order to do this, conceive the
four input signals to F and G as
address lines.
• Thus a CLB provides two 16×1 or
one 32×1 random access memory
module, respectively.
• At maximum we have 12kbyte
RAM available.
• Each RAM block can be configured
with different behavior:
•synchronous RAM: edge-triggered
• asynchronous RAM: level-sensitive
• single-port-RAM
• dual-port-RAM
The Xilinx XC4085 XLA device
2D-DWT FPGA Architectures targeting Lossless Compression
In the following, all four available architectures are listed.
• one Lifting unit
– data transfer from the PC directly to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
• four Lifting units working in parallel
– data transfer from the PC to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
The implementations with four Lifting units
introduce parallelism with respect to the rows of
a subimage.
Once the FPGA is configured, a whole image
can be transferred to the SRAM of the PCI card
(maximum size 2 Mbyte). Each subimage is then
loaded into the internal memory, wavelet
transformed, and the result is stored in the
SRAM, from where it can be transferred back to
the PC.
Another choice is to write a subimage directly
into the internal memory of the FPGA, start the
transform, and read the wavelet transformed
subimage back to the PC.
The computation itself is started if an internal
status register is set by the software interface.
The RAM itself is further decomposed into
smaller units to support the parallel execution of
the four Lifting units. It is capable of storing a
subimage of size 32×32.
There exist two separate finite state machines
to control the wavelet transform and its inverse,
respectively. (figure: the global data path diagram of a circuit with four parallel
working Lifting units.)
2D-DWT FPGA Architectures targeting Lossy Compression
We will distinguish between two different architectures for the partitioned
discrete wavelet transform on images.
I. 2D-DWT FPGA Architecture based on Divide and Conquer Technique
II. 2D-DWT FPGA Pipelined Architecture
2D-DWT FPGA Architecture based on Divide and Conquer Technique
In order to compute one level of the CDF(2,2) wavelet transform in one dimension we
need two pixels to the left and one pixel to the right of each row of an image.
Thus, we append to each row of an image one and two memory cells on the right and on
the left, respectively. We call such an enlarged row an extended row.
Let r be a row of an image of length 16. Now, the problem is split into two subproblems
which are solved interlocked. The first module computes the 'inner' coefficients c0,0, . . . ,
c0,15 and d0,0, . . . , d0,15, which only depend on the extended row. In order to proceed to the
next level of the wavelet transform, the coefficients at the appended positions have to be brought
up to date, i.e., the coefficients c0,−2, c0,−1, and c0,16 have to be computed. The first module,
which is responsible for the inner coefficients, computes the subtotals of c0,−2, c0,−1, and
c0,16 that only depend on the extended row. The second module computes the subtotals of
c0,−2, c0,−1, and c0,16 that depend on pixels not in the row.
2D-DWT FPGA Pipelined Architecture
• two one dimensional DWT units (1D-DWT) for horizontal
and vertical transforms
• a control unit as a finite state machine
• an internal memory block
To process a subimage, all rows are transferred to the FPGA over
the PCI bus and transformed on the fly in the horizontal
1D-DWT unit using pipelining. The coefficients computed
in this way are stored in internal memory of different types.
The coefficients corresponding to the rows of the subimage
itself are stored in single port RAM.
Now the vertical transform levels can take place. This is done by
the vertical 1D-DWT unit.
The control unit coordinates these steps in order to process a
whole subimage and is responsible for generating enable
signals, address lines, and so on.
At the end, the wavelet transformed subimage is available in the
internal RAM. At this point an EZW algorithm can be
applied to the multiscale representation of the subimage.
Since all necessary boundary information was included in the
computation, no block artifacts are introduced by the
following quantization.
control unit
4-level Horizontal 1D-DWT unit
The whole horizontal transform is done for the 16 rows of the subimage under consideration.
In addition, 30 rows of the neighboring subimage in the north and 15 rows of the southern
subimage are transformed in the same manner. These additional computations are required by the
vertical DWT applied next.
This unit has to take four pixels of a row at each clock cycle and must perform 4 levels of
horizontal transforms.
The unit consists of four pipelined stages, one for each transform level.
(figure: the four pipelined stages. The first stage runs at clock F = f0, the second stage at
F = f0/2; each stage splits its input into even and odd indexed samples, alternately outputs
a low or a high frequency coefficient at one clock cycle, and feeds its low frequency
coefficients as input to the next stage. The third and fourth stages work analogously.)
Recall:
Lifting Scheme for the CDF(2,2) wavelet after Integer-to-Integer Mapping
The w-bit input 1D-DWT unit (1i2o)
To implement one level of DWT using the lifting method the following steps are
necessary:
• split the input into coefficients at odd and even positions
• perform a predict step, that is, the operation given above
• perform an update step, that is, the operation given above
predict unit
update unit
w-bit input 1D-DWT unit (accordingly to the lifting scheme)
The unit consists of two register chains.
The registers in the upper chain are enabled at even, the registers in the lower chain at
odd clock edges. This splits the input into words at even and odd positions.
Now the predict and update steps can be applied straightforward.
The 4w-bit input 1D-DWT unit (4i4o)
This unit takes four pixels of the same row at a time.
We use it for both the first and the second level of the transform.
• The internal memory for the wavelet coefficients is capable of storing 16×16
coefficients.
• Since the bitwidth of the coefficients differs with their corresponding subbands, the
memory block consists of 5 slices.
• In Figure (a) we have shown once again the minimal required bitwidth for each
subband.
• The structure of the internal memory is illustrated in Figure (b).
Internal Coefficient Memory
FPGA-Implementation of the Modified SPIHT Encoder
(figure: block diagram of the modified SPIHT compressor — a coefficient memory to store
the wavelet coefficients, LP := LIP ∪ LSP, and the modules SL and SD storing the
pre-computed significance attributes for all thresholds; a table gives the size in bits
of each RAM block.)
In comparison to the basic SPIHT algorithm, the modified SPIHT image
compression reduces the required internal memory considerably
(N = 512, d0 = 11).
• Each subimage of the wavelet transformed image is
transferred once to the internal memory module named
’coeff’ or is already stored there.
• At first, the initialization of the modules representing
LIP, LSP, and LIS and the computation of the
significances is done in parallel.
• The lists LIP and LSP are managed by the module
’LP’, the bitmap of LIS by the module ’LIS’.
• The significances of sets are computed for all
thresholds th ≤ kmax at once and are stored in the
modules named ’SL’ and ’SD’, respectively.
• Here we distinguish between the significances for the
sets L and D.
• With this information the compression can be started
with bit plane kmax.
• Finite state machines control the overall procedure.
• The data to be output is registered in the module 'output',
from which it is put to the local SRAM on the PCI card
over a 32 bit wide data bus.
• Additionally, an arithmetic coder can be configured
into that module; this further reduces the size of the
compressed data stream.
the overall functionality
Hardware Implementation of the Lists
• To reduce the memory requirement for the list data structures in the worst case, we
implement the lists as bitmaps.
• The bitmap of list L represents the characteristic function of list L.
• The RAM module which realizes LIP and LSP has a configuration of q × q
entries of bit length 2, as for each pixel of the q × q subimage either
(i, j) ∈ LIP, (i, j) ∈ LSP, or (i, j) ∉ LIP ∪ LSP holds.
• The second RAM module implements LIS.
(table: the possible configuration states of a coordinate (i, j), encoded in two bits.)
• Since none of the coefficients in the level-one high frequency subbands can be the
root of a zerotree, a bitmap of size (q/2)² bits suffices for LIS.
• Some coefficients can be of type B, while others are always of type A: only for the
area which corresponds to LL(1) does the type information have to be stored. This
results in additional (q/4)² bits.
Efficient Computation of Significances
 Computing the significance of an individual coefficient: this is trivial.
Just select the kth bit of |c(i,j)| in order to obtain Sk(i, j). This can be realized by using bit
masks and a multiplexer.
 Computing significance of sets for all thresholds in parallel:
We define S*(T) as
S*(T) = max over (i, j) ∈ T of ⌊log₂ |c(i,j)|⌋
 Thus, S*(T) stands for the maximum threshold k for which some coefficient in T
becomes significant.
 Once S*(T) is computed for all sets L and D, we have preprocessed the
significances of sets for all thresholds. In order to do this, we use the two RAM
modules SL and SD. They are organized as the following memory layout, respectively.
SL SD
• The computation is done bottom up in the hierarchy defined by the spatial oriented
trees.
• The entries of both RAMs are initialized with zero.
• Now, let (e, f) be a coordinate with 0 < e, f < q just handled by the bottom-up
process, and let (i, j) = (⌊e/2⌋, ⌊f/2⌋) be the parent of (e, f), if it exists. Then SD and SL
have to be updated by the following process (a software sketch is given below).
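A software sketch of this bottom-up precomputation (the hardware uses the finite state machine described next; the LL band special cases and the type A/B distinction are simplified away):

import math

def star(v):
    # S* of a single coefficient: floor(log2 |v|), or -1 for v == 0.
    return int(math.floor(math.log2(abs(v)))) if v else -1

def precompute_significance(coeff):
    # coeff is a q-by-q array; SD/SL hold, per node, the maximum threshold k
    # at which some coefficient in D(i, j) / L(i, j) becomes significant.
    q = len(coeff)
    SD = [[-1] * q for _ in range(q)]
    SL = [[-1] * q for _ in range(q)]
    for e in range(q - 1, 0, -1):      # children are visited before parents
        for f in range(q - 1, 0, -1):
            i, j = e // 2, f // 2      # parent of (e, f)
            SL[i][j] = max(SL[i][j], SD[e][f])                     # grandchildren and below
            SD[i][j] = max(SD[i][j], SD[e][f], star(coeff[e][f]))  # all descendants
    return SL, SD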
• After reset we start the computation in state
one and initialize k’max and the row and
column indices e and f.
• At this time SD(e, f) and SL(e, f) hold their
old values from the last subimage under
consideration for all 0 < e, f < q.
• If the enable signal becomes active, we
proceed in state 2. Here we buffer the
present value of SD(e, f).
• In the states 3, 4, 5, and 6 we compute line
(6.1).
• The condition e, f are odd checks, if we
visit a 2 × 2 coefficient block for the first
time.
• The states 2, 5, and 6 are responsible for
computing the maximum of SD(i, j) and
S*(e, f) (line (6.2)), which is buffered in tS.
• State 8 performs the assignment in line
(6.3). Furthermore, this finite state machine
updates the value k′max = kmax + 1
for the subimage under consideration.
• In state 10 the low frequency coefficient at
position (0, 0) will be included in this
computation, too.
• The operation in state 9 is done using a
simple subtractor and a combined
representation with interleaved bitorder of
the row and column index e and f, that is
fn−1, en−1, fn−2, en−2, . . . , f1, e1, f0, e0.
My implementation
Lifting Scheme
s=a+ (b/2)
d=b-a
Hardware platform
The hardware platform used [WILDFORCE] is a PCI plug-in board with five
Xilinx 4085 FPGAs, also referred to as PEs (Processing Elements).
The board is stacked with five 1 MB SRAM chips.
Each of the five SRAM chips is directly connected to one of the five PEs.
The embedded memory is accessible for read/write from both the host
computer and the corresponding PE.
Each 1 MB memory chip is organized as 262144 words of 32 bits each.
Memory read/write
• The input image : 512 by 512 pixels
• Input frames are loaded to the embedded memory by the host computer
and results are read back, once the PE has processed it.
• The PE also uses the embedded memory as intermediate storage to hold
results between different stages of processing.
• Memory reads can be pipelined so that the effects of this latency are
minimized.
Design partitioning
The whole computation is partitioned into two stages.
The first stage:
Computes discrete wavelet transform coefficients of the input image frame and
writes it back to the embedded memory.
The second stage:
Operates on this result to complete the rest of the processing (dynamic
quantization, zero thresholding, run length encoding for zeroes, and
entropy encoding on the coefficients)
The two stages are implemented on two separate FPGAs.
Stage 1: Discrete Wavelet Transform
(2, 2) wavelet:
• A modified form of the biorthogonal (2,2) Cohen-Daubechies-Feauveau
wavelet filter is used. The analysis filter equations are shown below.
• The boundary conditions are handled by symmetric extension of the
coefficients as shown below
• The synthesis filter equations are shown below
DWT in X and Y directions
Coefficient ordering along X direction
Coefficient ordering along Y direction
Each pixel in the input frame is represented by 16 bits, accounting for 2 pixels per
memory word. Thus, each memory read brings in two consecutive pixels of a row.
3 stages of wave-letting
High pass and Low pass coefficients at stage 1, X direction
Interleaved ordering along the 3 stages of wave-letting
• Memory addressing is done with a pair of address registers - read and write
address registers.
Stage 1 architecture
The difference between write and read registers is the latency of the pipelined
data-flow blocks.
The maximum and minimum coefficient values for each block (each
quadrant in the multi stage wave-letting) are maintained on the FPGA.
These values are written back to a known location in the lower half (lower
0.5MB) of the embedded memory.
The second stage, uses these values for the dynamic quantization
of the coefficients.
Stage 2
Dynamic quantization
• The coefficients from different sub-bands are quantized separately.
The dynamic range of the coefficients for each sub-band (computed in
first stage) is divided into 16 quantization levels.
• The coefficients are quantized into one of the 16 possible levels.
• The maximum and minimum values of the coefficients for each
sub-band are also needed while decoding the image.
The quantizer is implemented as a binary search tree look-up in hardware (sketched below).
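A Python sketch of the 16-level quantization as a binary search (uniform bin edges over the sub-band's dynamic range are an assumption):

def quantize(c, cmin, cmax, levels=16):
    # Binary search over the bin edges: 4 comparisons for 16 levels,
    # mirroring a binary search tree look-up in hardware.
    lo, hi = 0, levels
    while hi - lo > 1:
        mid = (lo + hi) // 2
        edge = cmin + (cmax - cmin) * mid / levels
        if c < edge:
            hi = mid
        else:
            lo = mid
    return lo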
Zero thresholding and RLE on zeroes
Stage 2
Different thresholds are used for different sub-bands, resulting in different resolution
in different sub-bands.
Entropy encoding
Stage 2
Entropy encoding
• The encoding is implemented
by two look-up tables on the
FPGA. Given an eight bit
input, the first look-up table
(LUT), provides information
about the size of encoding.
The second LUT gives the
actual encoding.
• Only the relevant bits from
the second LUT should be
used.
• The rest of the bits in the
output are don’t care and are
either chosen as logic 0 or 1
during logic optimization.
Stage 2
Entropy encoder
Entropy encoding
-Bit packing:
The output of the entropy encoder varies from 3 to 18 bits. The bits need to be
packed into 32 bit words before being written back to the embedded
memory.
This is achieved by the shifter, which is inspired by the Xtetris
computer game and the binary search algorithm.
The shifter consists of 5 register stages, each 32 bits wide. The input data can
be shifted (rotated) by 16 or latched without shifting, to stage 1.
The data can be shifted by 8 or passed on straight from stage 1 to stage 2.
Similarly data can be shifted by 4, 2, and 1 when moving between the
remaining stages.
Data is shifted from stage to stage, and is accumulated at the last stage.
When the last stage has 32 bits of data, a memory write is initiated and the last
stage is flushed.
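A software analogue of the shifter (the hardware uses the five register stages described above; this sketch accumulates variable-length codes MSB-first into 32 bit words):

def pack_bits(codes, width=32):
    # codes is a list of (value, nbits) pairs, nbits between 3 and 18.
    words, acc, nbits = [], 0, 0
    for value, n in codes:
        acc = (acc << n) | (value & ((1 << n) - 1))
        nbits += n
        while nbits >= width:           # a full word is available: flush it
            nbits -= width
            words.append((acc >> nbits) & ((1 << width) - 1))
    if nbits:                            # final partial word, left aligned
        words.append((acc << (width - nbits)) & ((1 << width) - 1))
    return words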
Stage 2
Stage 2
Output file format
• At the end of the second stage, the upper memory (upper
0.5MB) contains the packed bit stream. The total count of the
bit stream approximated to the nearest WORD is written to
memory location 0. To reconstruct the data from the bit stream,
the following information is needed.
 The actual bit stream. On Huffman decoding, the actual 8 bit
codes are retrieved. These codes are either the quantizer output,
or the RLE count. On expanding the RLE count to the
corresponding number of zeroes, we get the actual quantized
stream.
 The four quadrants of the final stage of wave-letting can be
located at the first four 128×128 byte blocks. The three
quadrants of the next stage can be located at the next three
blocks, sized 256×256 bytes each. Each quadrant (sub-band) is
quantized separately. The dynamic range of each quadrant
must be known to reconstruct the original stream.
 The output file written has all the information needed to
reconstruct the image
Stage 2
Output file format
Stage 2
Stage 2, data flow diagram
Overall architecture
Wavelet coefficients from memory are read from the lower half of the embedded
memory. The block (sub-band) minimum and maximum is also read from the
memory. The packed bit stream output is written to the upper memory, and the bit
stream length is written to memory location 0. The control software, reads the
embedded memory and generates the compressed image file.
Before reading the wavelet coefficients, the maximum and minimum of coefficients in
each sub-band are read from the lower memory. The coefficients are then read and
processed for each sub-band, starting with the lowest frequency band. As shown in
the state diagram, a memory read is fired in state Read 001. A memory read has a
latency of 2 clock cycles; the result of the read is finally available in state Read
100.
Memory writes are completed in the same cycle. The two intermediate states, Read 010
and Write can be used to write back the output, if output is available.
Each memory read brings in two wavelet coefficients.
Consider the worst case, where the two coefficients get expanded to 18 bits each.
There are two memory write cycles before the next read. Whenever a memory
write is performed, the memory address register is incremented. The read address
generators read each sub-band from the interleaved memory pattern.
The output is written as a continuous stream, starting with the lowest sub-band. Thus
the output is effectively in Mallat ordering and can be progressively
transmitted/decoded.
Stage 2, control flow diagram
Questions??
Discussion!!
Suggestions!!
Criticism!!
Wavelet Based Image Compression Using FPGA
  • 13. Discrete Wavelet Transforms
• Wavelet analysis or wavelet decomposition: the conversion from the original signal to the wavelet coefficients.
• Wavelet synthesis or wavelet reconstruction: the conversion from the wavelet coefficients back to the signal, or to an approximated version of it.
• We distinguish the analysis scaling and wavelet functions from the synthesis scaling and wavelet functions φ and Ψ; the corresponding filters are denoted accordingly, the analysis filter pair alongside the synthesis filters h and g.
  • 14. Discrete Wavelet Transforms
One level of wavelet transform expressed using filters: the signal x is passed through the low pass and high pass filters, and even and odd indices are discarded after filtering. The low pass branch yields a coarse version of the signal x at half resolution; the high pass branch yields the differences or details that are necessary to reconstruct the original signal x from the coarse version.
  • 16. Discrete Wavelet Transforms
Three levels of wavelet transform: Level 0 (j=0, c0,l, d0,l), Level 1 (j=-1, c-1,l, d-1,l), Level 2 (j=-2, c-2,l, d-2,l).
  • 17. Cohen-Daubechies-Feauveau CDF(2,2) Wavelet (biorthogonal (5,3) wavelet)
• Filter lengths of 5 and 3 for the low and high pass filters.
• The filters, as well as the scaling and wavelet functions for decomposition and reconstruction, are symmetric.
• A symmetric filter f always has odd filter length, and it holds that fa+k = fb−k, where a and b are the smallest and greatest index l, respectively, for which fl is different from zero.
• Symmetry is a very important property for image compression, because in the absence of symmetry artifacts are introduced around edges.
  • 20. The relation between the regularity of the synthesis wavelet and the number of vanishing moments of the analysis wavelet
• A biorthogonal wavelet has m vanishing moments if and only if its dual scaling function generates polynomials up to degree m. Vanishing moments tend to reduce the number of significant wavelet coefficients, so one should select a wavelet with many of them for the analysis.
• On the other hand, regular or smooth synthesis wavelets give good approximations if not all coefficients are used for reconstruction, as is the case in lossy compression.
• To increase the number of vanishing moments of the decomposition wavelet, one has to enlarge the filter length of the corresponding analysis low and high pass filters. That is, there is a trade-off between filter length and the number of vanishing moments of the decomposition wavelet. In terms of image compression, one can improve the compression performance at the expense of the computational power needed to calculate the filter operations.
  • 21. Lifting Scheme
• An alternative method for computing the discrete wavelet transform.
• To be consistent, we base our introduction to lifting on the CDF(2,2) wavelet, which also serves as the running example. The Lifting Scheme is composed of three steps:
1. Split (also called the Lazy Wavelet Transform): split the input signal x into even and odd indexed samples.
2. Predict: predict the odd samples based on the even ones.
3. Update: ensure that the average of the signal is preserved.
  • 22. Lifting Scheme
The odd samples are replaced by the old ones minus the prediction; these are the detail coefficients. The updated even samples form the approximation, the coarser version of the input sequence at half the resolution; the update step ensures that the average of the signal is preserved.
  • 23. Lifting Scheme
With z-transform notation, the split and merge steps can be expressed using downsampling, delay, and upsampling, respectively.
  • 24. The predict and update steps for the CDF(2,2) wavelet
• Here the predictor is chosen to be linear; in that case the detail coefficients of a linear signal will be zero.
• Therefore we have the corresponding formulas for the prediction step and for the update step (see the sketch below).
What are the advantages of this method?
1. The most important fact is that we do not throw away already computed coefficients, as in the filter bank approach.
2. It is also remarkable that the wavelet transform can now be computed in place. This means that, given a finite length signal with n samples, we need exactly n memory cells, each capable of storing one sample, to compute the transform.
3. Furthermore, we reduce the number of operations needed to compute the coefficients of the next coarser or finer scale. For the CDF(2,2) wavelet we save three operations using the Lifting Scheme in comparison with the traditional filter bank approach.
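A minimal Python sketch of one lifting level for the CDF(2,2) wavelet: split, linear prediction of the odd samples from their even neighbours, and an update of the even samples that preserves the signal average. The simple boundary handling (repeating the last neighbour) is an assumption made for brevity:

def cdf22_lift(x):
    even, odd = x[0::2], x[1::2]                    # split (lazy wavelet)
    n = len(odd)
    # predict: d[l] = odd[l] - (even[l] + even[l+1]) / 2
    d = [odd[l] - (even[l] + even[min(l + 1, len(even) - 1)]) / 2
         for l in range(n)]
    # update: c[l] = even[l] + (d[l-1] + d[l]) / 4
    c = [even[l] + (d[max(l - 1, 0)] + d[min(l, n - 1)]) / 4
         for l in range(len(even))]
    return c, d                                     # coarse part, details

c, d = cdf22_lift([0, 1, 2, 3, 4, 5, 6, 7])
print(d)   # interior details are zero for a linear signal: [0.0, 0.0, 0.0, 1.0]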
  • 25. Integer-to-Integer Mapping
• Obviously, the application of the filter bank approach or the Lifting Scheme leads to coefficients which are not integers in general. For hardware image compression it is convenient that the coefficients and the pixels of the reconstructed image are integers as well.
• For the special case of the CDF(2,2) wavelet, we therefore use rounded prediction and update steps.
• As a consequence, the coefficients of all scales −J < j ≤ 0 can be stored as integers, and integer arithmetic is sufficient for all operations. Note that the coarser the scale, the more bits are necessary to store the corresponding coefficients. To overcome the growing bit width at coarser scales, modular arithmetic can be used in the case of lossless compression.
  • 26. Lifting Scheme for the CDF(2,2) wavelet after integer-to-integer mapping (see the sketch below)
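A sketch of the reversible integer variant, together with its exact inverse. The rounding convention shown is the one commonly used for the reversible 5/3 transform; treat it as an assumption rather than the slide's exact formula:

def cdf22_lift_int(x):
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # rounded prediction: floor of the neighbour average
    d = [odd[l] - (even[l] + even[min(l + 1, len(even) - 1)]) // 2
         for l in range(n)]
    # rounded update: floor((d[l-1] + d[l] + 2) / 4)
    c = [even[l] + (d[max(l - 1, 0)] + d[min(l, n - 1)] + 2) // 4
         for l in range(len(even))]
    return c, d                                 # both lists hold integers only

def cdf22_unlift_int(c, d):
    n = len(d)
    # invert the update, then the prediction, then merge even/odd samples
    even = [c[l] - (d[max(l - 1, 0)] + d[min(l, n - 1)] + 2) // 4
            for l in range(len(c))]
    odd = [d[l] + (even[l] + even[min(l + 1, len(even) - 1)]) // 2
           for l in range(n)]
    return [v for pair in zip(even, odd) for v in pair]

x = [71, 68, 70, 75, 80, 82, 79, 77]
assert cdf22_unlift_int(*cdf22_lift_int(x)) == x    # perfect reconstruction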
  • 28. Wavelet transforms on images
• To transform images, we can either use two dimensional wavelets or apply the one dimensional transform to the rows and columns of the image successively, as a separable two dimensional transform. The image is interpreted as a two dimensional array I.
  • 29. Wavelet transforms on images
Figure: the image pixels, the wavelet transformed image, and the coefficients.
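A minimal sketch of the separable two dimensional transform just described: the same 1D single-level transform is applied first to every row and then to every column. haar_lift is the simplest possible lifting step and only stands in for the CDF(2,2) step; the layout puts the LL quadrant in the top left corner:

def haar_lift(x):
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]          # predict
    c = [e + dd / 2 for e, dd in zip(even, d)]      # update: keeps the mean
    return c, d

def dwt2d(image, lift1d):
    def rows(img):
        out = []
        for r in img:
            c, d = lift1d(r)
            out.append(c + d)          # low-pass half, then high-pass half
        return out
    t = rows(image)                    # transform all rows
    t = [list(col) for col in zip(*t)] # transpose
    t = rows(t)                        # transform all columns
    return [list(col) for col in zip(*t)]   # transpose back

flat = [[5] * 8 for _ in range(8)]
out = dwt2d(flat, haar_lift)
print(out[0][0], out[4][4])   # 5.0 in LL, 0 in the detail quadrants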
  • 30. Reflection at Image Boundary
A row of an image r = (r0, . . . , rN−1). In order to convolve such a row r with a filter f, we have to extend it to infinity in both directions, giving an extended row r′.
  • 31. Reflection at Image Boundary
There are several choices for the values of r′k outside the interval [0, N−1]. The most popular ones are (see the sketch below):
• padding with zeros,
• periodic extension, or
• symmetric extension.
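The three extensions, illustrated with numpy's padding modes (the mode names are numpy's, not the slides'); 'reflect' is the symmetric extension that avoids artificial jumps at the border:

import numpy as np

r = np.array([10, 20, 30, 40])
print(np.pad(r, 2, mode='constant'))   # zeros:     0  0 10 20 30 40  0  0
print(np.pad(r, 2, mode='wrap'))       # periodic: 30 40 10 20 30 40 10 20
print(np.pad(r, 2, mode='reflect'))    # symmetric: 30 20 10 20 30 40 30 20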
  • 34. 2D-DWT
After one level of transform we obtain N/2 coefficients c0,l and N/2 coefficients d0,k. Because of the split into odd and even indexed positions in the Lifting Scheme, these are given in interleaved order, which is usually rearranged so that all low pass coefficients come first (see the sketch below).
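A small sketch of this reordering and its inverse:

y = ['c0', 'd0', 'c1', 'd1', 'c2', 'd2', 'c3', 'd3']   # interleaved order
rearranged = y[0::2] + y[1::2]     # all c's first, then all d's

half = len(y) // 2
interleaved = [v for pair in zip(rearranged[:half], rearranged[half:])
               for v in pair]
assert interleaved == y            # the reordering is trivially invertible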
  • 36. 2D-DWT
• Since we have restricted the images to be square, we can perform at most l = log2 N levels of transform.
  • 38. 2D-DWT
• In order to preserve the average of a one dimensional signal, or the average brightness of an image, normalization factors have to be applied after the wavelet transform has taken place.
Normalization factors of the CDF(2,2) wavelet in two dimensions for each level l, 0 ≤ l < 5 (table shown on the slide).
  • 40. Lena: low- and high-pass filtered and subsampled
  • 41. 1-level 2-D wavelet decomposition
  • 42. 2-level 2-D wavelet decomposition
  • 43. 3-level 2-D wavelet decomposition
  • 45. Diagrammatic representation of the dyadic decomposition for three decomposition levels
  • 46. The in-place mapping scheme. The dyadic decomposition is applied on a hypothetical 8×8 original image.
  • 47. State-of-the-Art Image Compression Techniques
  • 48. Shapiro's Algorithm (EZW)
• In 1993, Shapiro presented an efficient method to compress wavelet transformed images: the embedded zerotree wavelet (EZW) encoder.
• The EZW encoder exploits the properties of the multi-scale representation.
• A significant improvement of this central idea was introduced by Said and Pearlman in 1996: Set Partitioning In Hierarchical Trees (SPIHT, pronounced "spite").
  • 49. Shapiro's Algorithm (EZW)
A multi-resolution analysis example: a lower octave has higher resolution and contains higher frequency information.
  • 50. Parent-Child Relationship of the LL Subband Shapiro’s Algorithm (EZW)
  • 51. Shapiro's Algorithm (EZW): Tree Structure of Wavelet Coefficients
• Every coefficient at a given scale can be related to a set of coefficients at the next finer scale of similar orientation.
• Parent: the coefficient at the coarse scale.
• Children: all coefficients corresponding to the same spatial location at the next finer scale of similar orientation.
• Descendants: for a given parent, the set of all coefficients at all finer scales of similar orientation corresponding to the same location.
  • 52. Hierarchical trees in multi-level decomposition
  • 53. Shapiro's Algorithm (EZW): coefficients at the same spatial location form a quad-tree.
  • 54. Shapiro's Algorithm (EZW)
• E – The EZW encoder is based on progressive encoding, also known as embedded encoding.
• Z – A data structure called the zero-tree is used in the EZW algorithm to encode the data.
• W – The EZW encoder is specially designed to be used with wavelet transforms. It was originally designed to operate on images (2-D signals).
  • 55. Shapiro's Algorithm (EZW)
• A kind of bitplane coding: the kth bits of the coefficients constitute a bitplane.
• A bitplane encoder starts coding with the most significant bit of each coefficient.
• Within a bitplane, the bits of the coefficients with the largest magnitude come first.
Figure: the coefficients are shown in decreasing order from left to right; each coefficient is represented with eight bits, with the least significant bit in front, and the labels mark sign, MSB, LSB and the coding order.
  • 67. Shapiro's Algorithm (EZW)
• Self-similarities between different scales result from the recursive application of the wavelet transform step to the low frequency band.
  • 68. Shapiro's Algorithm (EZW)
• Shapiro proposes to scan the samples from left to right and from top to bottom within each subband, starting in the upper left corner.
• The subbands LH, HL, and HH at each scale are scanned in that order.
• Furthermore, in contrast to traditional bitplane coders, he introduced data dependent examination of the coefficients.
• The idea is that if there are large areas with samples that are unimportant in terms of compression, they should be excluded from exploration. The addressed self-similarities are the key to performing such exclusions of large areas.
  • 69. Shapiro's Algorithm (EZW)
• In order to exploit the self-similarities during the coding process, oriented trees of outdegree four are used to represent a wavelet transformed image.
• Each node of a tree represents a coefficient of the transformed image.
• The levels of the trees consist of coefficients at the same scale.
• The trees are rooted at the lowest frequency subband of the representation. Each coefficient in the LH, HL, and HH subbands of each scale has four children; the coefficients at the highest frequency subbands have no children; there is only one coefficient in the lowest frequency band (the DC coefficient) that has three children.
Figure: oriented quad trees, four transform levels, N = 16.
  • 70. Shapiro's Algorithm (EZW)
A coefficient of the wavelet transformed image is insignificant with respect to a threshold th if its magnitude |c| is smaller than 2^th (the current threshold); otherwise it is called significant with respect to the threshold th.
 Dominant pass: the coefficients are scanned in raster order (from left to right and from top to bottom) within the quadrants. The scan starts with the quadrants of the highest transform level; in each transform level the quadrants are scanned in the order HL, LH, and HH. The coefficients are coded by the symbols P, N, ZTR, or IZ. A coefficient is coded by:
• P, if it is greater than the given threshold and is positive;
• N, if its absolute value is greater than the given threshold and it is negative;
• ZTR (zero tree root), if its absolute value is smaller than the given threshold and the absolute values of all coefficients in the corresponding quad tree are smaller than the threshold, too;
• IZ (isolated zero), if its absolute value is smaller than the given threshold and there exists at least one coefficient in the corresponding quad tree that is greater than the given threshold in absolute value.
  • 71. Shapiro's Algorithm (EZW)
Z:
 It is used within the high frequency bands of level one only, because the coefficients in these quadrants cannot be roots of a zerotree.
 It can thus be seen as the combination of ZTR and IZ for this special case. Once a coefficient is encoded as the symbol P or N, it is no longer included in the determination of zerotrees.
 Subordinate pass:
• Each coefficient that has been coded as P or N in a previous dominant pass is now refined by coding the th-bit of its binary representation.
• This corresponds to a bitplane coding, where the coefficients are refined in a data dependent manner.
• The most important fact here is that no indices of the coefficients under consideration have to be coded; this happens implicitly, due to the order in which they become significant (coded as P or N in the dominant pass). A sketch of the dominant-pass classification follows below.
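A sketch of the dominant-pass symbol assignment just described. The tree representation (a dict mapping each node to its children) and the integer node labels are assumptions made for this illustration:

def descendants(node, children):
    out = []
    for ch in children.get(node, []):
        out.append(ch)
        out.extend(descendants(ch, children))
    return out

def classify(node, coeff, children, T):
    c = coeff[node]
    if c >= T:
        return 'P'                       # significant, positive
    if c <= -T:
        return 'N'                       # significant, negative
    desc = descendants(node, children)
    if not desc:
        return 'Z'                       # level-one band: no tree possible
    if all(abs(coeff[d]) < T for d in desc):
        return 'ZTR'                     # zero tree root
    return 'IZ'                          # isolated zero

# Tiny example: a root with four insignificant children at threshold 32.
coeff = {0: 10, 1: 5, 2: -3, 3: 7, 4: 1}
children = {0: [1, 2, 3, 4]}
print(classify(0, coeff, children, 32))  # -> 'ZTR'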
  • 72. Shapiro's Algorithm (EZW)
Example: encoding the wavelet transformed image given in the figure using the embedded zerotree wavelet algorithm of Shapiro; the subband HL is excluded and the subband LH is scanned only partially.
  • 73. EZW – basic concepts
The EZW algorithm is based on two observations:
– Natural images in general have a low pass spectrum. When an image is wavelet transformed, the energy in the sub-bands decreases as the scale goes lower (low scale means high resolution), so the wavelet coefficients will, on average, be smaller in the lower levels than in the higher levels.
– Large wavelet coefficients are more important than small wavelet coefficients.
(The slide shows typical wavelet coefficients for an 8×8 block of a real image, with the higher levels in the upper left corner.)
  • 74. EZW – basic concepts
The observations give rise to the basic progressive coding idea:
1. We can set a threshold T; if a wavelet coefficient is larger than T, we encode it as 1, otherwise as 0.
2. A '1' will be reconstructed as T (or a number larger than T) and a '0' will be reconstructed as 0.
3. We then decrease T to a lower value and repeat steps 1 and 2, obtaining finer and finer reconstructed data.
The actual implementation of the EZW algorithm must consider:
1. What should we do with the sign of the coefficients (positive or negative)? Answer: use POS (P) and NEG (N).
2. Can we code the '0's more efficiently? Answer: the zero-tree.
3. How do we decide the threshold T and how do we reconstruct? Answer: see the algorithm.
  • 75. EZW – basic concepts
• The definition of the zero-tree: there are coefficients in different subbands that represent the same spatial location in the image, and this spatial relation can be depicted by a quad tree, except for the root node at the top left corner representing the DC coefficient, which has only three children nodes.
• Zero-tree hypothesis: if a wavelet coefficient c at a coarse scale is insignificant with respect to a given threshold T, i.e. |c| < T, then all wavelet coefficients of the same orientation at finer scales are also likely to be insignificant with respect to T.
  • 76. EZW – basic concepts
First step: the DWT of the entire 2-D image is computed by the FWT.
Second step: EZW progressively encodes the coefficients by decreasing the threshold.
Third step: arithmetic coding is used to entropy code the symbols.
  • 77. EZW – basic concepts
Second step: EZW progressively encodes the coefficients by decreasing the threshold. Here MAX() means the maximum coefficient magnitude in the image and y(x,y) denotes a coefficient; the initial threshold t0 is the largest power of two not exceeding MAX(|y(x,y)|). With this threshold we enter the main coding loop:
threshold = initial_threshold;
do {
  dominant_pass(image);
  subordinate_pass(image);
  threshold = threshold/2;
} while (threshold > minimum_threshold);
The main loop ends when the threshold reaches a minimum value, which can be specified to control the encoding performance; a "0" minimum value gives lossless reconstruction of the image.
  • 78. EZW – basic concepts
In the dominant_pass:
• All the coefficients are scanned in a special order.
• If a coefficient is a zero tree root, it is encoded as ZTR. None of its descendants need to be encoded; they will be reconstructed as zero at this threshold level.
• If the coefficient itself is insignificant but one of its descendants is significant, it is encoded as IZ (isolated zero).
• If the coefficient is significant, it is encoded as POS (P) or NEG (N) depending on its sign.
This encoding of the zero tree produces significant compression, because gray level images resulting from natural sources typically result in DWTs with many ZTR symbols. Each ZTR indicates that no more bits are needed to encode the descendants of the corresponding coefficient.
  • 79. EZW – basic concepts
At the end of the dominant_pass, all the coefficients that are larger in absolute value than the current threshold are extracted and placed, without their sign, on the subordinate list, and their positions in the image are filled with zeroes. This prevents them from being coded again.
In the subordinate_pass, all the values in the subordinate list are refined. This gives rise to some juggling with uncertainty intervals, and it outputs the next most significant bit of all the coefficients in the subordinate list. A sketch of the encoder loop follows below.
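The same loop written out in Python, with the initial threshold chosen as the largest power of two not exceeding the largest coefficient magnitude; the passes themselves are left as placeholders:

import math

def ezw_encode(coeffs, minimum_threshold=0,
               dominant_pass=lambda c, T: None,
               subordinate_pass=lambda c, T: None):
    # initial threshold: 2 ** floor(log2(max |c|))
    T = 2 ** int(math.log2(max(abs(c) for c in coeffs)))
    while T > minimum_threshold:
        dominant_pass(coeffs, T)       # emits P/N/ZTR/IZ symbols
        subordinate_pass(coeffs, T)    # refines coefficients found so far
        T //= 2
    # minimum_threshold = 0 runs down to T = 1, i.e. lossless for integers

ezw_encode([63, -34, 49, 10, 7, 13])   # initial threshold is 32 here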
  • 80. EZW – an example
Wavelet coefficients for an 8×8 block.
  • 81–83. EZW – an example
The initial threshold is 32, and the result of the dominant_pass is shown below; entries without a symbol are nodes inside a zero-tree (together the symbols form the significance map):
63 POS | -34 NEG | 49 POS | 10 ZTR | 7 IZ | 13 IZ | -12 | 7
-31 IZ | 23 ZTR | 14 ZTR | -13 ZTR | 3 IZ | 4 IZ | 6 | -1
15 ZTR | 14 IZ | 3 | -12 | 5 | -7 | 3 | 9
-9 ZTR | -7 ZTR | -14 | 8 | 4 | -2 | 3 | 2
-5 | 9 | -1 IZ | 47 POS | 4 | 6 | -2 | 2
3 | 0 | -3 IZ | 2 IZ | 3 | -2 | 0 | 4
2 | -3 | 6 | -4 | 3 | 6 | 3 | 6
5 | 11 | 5 | 6 | 0 | 3 | -4 | 4
  • 84. EZW – an example
The result of the dominant_pass is output as:
D1: POS, NEG, IZ, ZTR, POS, ZTR, ZTR, ZTR, ZTR, IZ, ZTR, ZTR, IZ, IZ, IZ, IZ, IZ, POS, IZ, IZ
The significant coefficients are put on a subordinate list and refined; a one-bit symbol per coefficient is output to the decoder:
Original data: 63, 34, 49, 47
Output symbols (S1): 1, 0, 1, 0
Reconstructed data: 56, 40, 56, 40
For example, the code for 63 so far is sign 0 and bit 1 at plane 32; the subordinate pass now emits the plane-16 bit. If the data item is at least T + 0.5T = 48, a 1 is put in the code and the item is reconstructed as the average of 1.5T and 2T, i.e. (48 + 64)/2 = 56. If it is below T + 0.5T, a 0 is put in the code and the item is reconstructed as the average of T and 1.5T, i.e. (32 + 48)/2 = 40. So 63 is reconstructed as 56, and 34 as 40.
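A sketch of this uncertainty-interval refinement: each subordinate bit halves the interval in which a significant coefficient is known to lie, and the decoder reconstructs at the midpoint of the remaining interval:

def refine(value, T, passes):
    lo, hi = T, 2 * T                 # known significant at threshold T
    bits = []
    for _ in range(passes):
        mid = (lo + hi) / 2
        if value >= mid:
            bits.append(1); lo = mid  # upper half of the interval
        else:
            bits.append(0); hi = mid  # lower half
    return bits, (lo + hi) / 2        # emitted bits, reconstructed value

print(refine(63, 32, 1))   # ([1], 56.0)  as in the worked example
print(refine(34, 32, 1))   # ([0], 40.0)
print(refine(63, 32, 2))   # ([1, 1], 60.0)  after the second pass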
  • 85. EZW – an example
After the dominant pass, the significant coefficients are replaced by * in the significance map. Then the threshold is divided by 2, so 16 becomes the current threshold.
(The slide shows the 8×8 map with 63, -34, 49 and 47 replaced by *.)
  • 86. EZW – an example
The result of the second dominant pass is output as:
D2: IZ, ZTR, NEG, POS, IZ, IZ, IZ, IZ, IZ, IZ, IZ, IZ
The significant coefficients are put on the subordinate list, and all data in this list are refined:
Original data: 63, 34, 49, 47, 31, 23
Output symbols: 1, 0, 0, 1, 1, 0
Reconstructed data: 60, 36, 52, 44, 28, 20
For example, the code for 63 is now sign 0 and bits 1 1 1 at planes 32, 16 and 8. The computation is extended to the next significant bit, so 63 is reconstructed as the average of 56 and 64, i.e. 60.
  • 87. EZW – an example
The process goes on until threshold = 1; the final output is (p = POS, n = NEG, z = IZ, t = ZTR, Di = ith dominant pass, Si = ith subordinate pass output):
D1: pnztpttttztttttttptt
S1: 1010
D2: ztnptttttttt
S2: 100110
D3: zzzzzppnppnttnnptpttnttttttttptttptttttttttptttttttttttt
S3: 10011101111011011000
D4: zzzzzzztztznzzzzpttptpptpnptntttttptpnpppptttttptptttpnp
S4: 11011111011001000001110110100010010101100
D5: zzzzztzzzzztpzzzttpttttnptppttptttnppnttttpnnpttpttppttt
S5: 10111100110100010111110101101100100000000110110110011000111
D6: zzzttztttztttttnnttt
For example, the full code for 63 is sign 0 and bits 1 1 1 1 1 1 at planes 32, 16, 8, 4, 2, 1, so 63 is reconstructed as 32+16+8+4+2+1 = 63. Note how progressive transmission can be done.
  • 88. SPIHT – Set Partitioning In Hierarchical Trees
(The slide shows example wavelet coefficient matrices.)
  • 89. SPIHT – Set Partitioning In Hierarchical Trees
Said and Pearlman have significantly improved the codec of Shapiro. The main idea is based on the partitioning of sets, which consist of coefficients or representatives of whole subtrees. They classify the coefficients of a wavelet transformed image into three sets:
• LIP: the list of insignificant pixels, which contains the coordinates of those coefficients that are insignificant with respect to the current threshold th;
• LSP: the list of significant pixels, which contains the coordinates of those coefficients that are significant with respect to th;
• LIS: the list of insignificant sets, which contains the coordinates of the roots of insignificant subtrees.
During the compression procedure, the sets of coefficients in LIS are refined, and if coefficients become significant they are moved from LIP to LSP.
  • 90. SPIHT – Set Partitioning In Hierarchical Trees
• The first difference to Shapiro's EZW algorithm is the distinct definition of significance: here, the root of the tree is excluded from the computation of the significance attribute.
• The sets O(i, j), D(i, j) and L(i, j) (offspring, descendants, and descendants without offspring) are defined on the following slides.
  • 91. SPIHT – Set Partitioning In Hierarchical Trees
Example: the sets O(i, j), D(i, j) and L(i, j) with i = 1 and j = 1; the labels O, D, L show that a coefficient is a member of the corresponding set (N = 8).
  • 92. SPIHT – Set Partitioning In Hierarchical Trees
The entries of LIS are of type A or type B (illustrated on the slide).
  • 93. SPIHT – Set Partitioning In Hierarchical Trees: Significance Attribute
In the SPIHT algorithm the significance is computed for the sets D(i, j) and L(i, j). The root of each quadtree is, in contrast to the algorithm presented by Shapiro, not included in the computation of the significance.
  • 94. SPIHT – Set Partitioning In Hierarchical Trees
Parent-child relationship of the LL subband; the figure shows the different parent-child relationships within the LL band.
  • 96. SPIHT Sorting Pass
• O(i,j): the set of coordinates of all offspring of node (i,j); children only.
• D(i,j): the set of coordinates of all descendants of node (i,j); children, grandchildren, great-grandchildren, etc.
• L(i,j) = D(i,j) − O(i,j): all descendants except the offspring; grandchildren, great-grandchildren, etc.
  • 97. SPIHT Refinement Pass
• O(i,j), D(i,j) and L(i,j) as defined for the sorting pass.
• H: the set of all tree roots (nodes in the highest pyramid level).
• The refinement pass outputs the kth bit of each element of the list LSP, if it was not included in the last sorting pass.
A sketch of these sets follows below.
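A sketch of the sets for a q×q transformed image. The children rule (i, j) -> (2i, 2j), (2i, 2j+1), (2i+1, 2j), (2i+1, 2j+1) is the generic quad-tree rule and deliberately ignores the special LL-band cases shown earlier:

def O(i, j):
    """Offspring: the four children of node (i, j)."""
    return [(2*i, 2*j), (2*i, 2*j+1), (2*i+1, 2*j), (2*i+1, 2*j+1)]

def D(i, j, q):
    """All descendants of (i, j): children, grandchildren, ..."""
    out = []
    for (a, b) in O(i, j):
        if a < q and b < q:
            out.append((a, b))
            out.extend(D(a, b, q))
    return out

def L(i, j, q):
    """Descendants minus offspring: grandchildren and below."""
    return [n for n in D(i, j, q) if n not in O(i, j)]

def significant(nodes, coeff, n):
    """Significance of a set against the threshold 2**n (root excluded)."""
    return any(abs(coeff[p]) >= 2 ** n for p in nodes)

coeff = {p: 0 for p in D(1, 1, 8)}
coeff[(2, 2)] = 40
print(significant(D(1, 1, 8), coeff, 5))   # True: 40 >= 2**5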
  • 98. SPIHT – Set Partitioning In Hierarchical Trees
Example of SPIHT: initially the lists hold all tree roots and the D sets of those roots (figure labels: "All roots", "D's of roots").
  • 99–102. SPIHT – Set Partitioning In Hierarchical Trees
(The worked example is continued on these slides for n = 4, n = 3 and n = 2.)
  • 103–104. SPIHT – Set Partitioning In Hierarchical Trees
Example of a 3-scale wavelet transform of an 8 by 8 image (coefficient matrix shown on the slide).
  • 105. SPIHT – Set Partitioning In Hierarchical Trees
0) Initialization:
LSP ← {}
LIP ← {(63, -34, -31, 23)}
LIS ← {(-34, -31, 23)}
1) n = 5 (T = 2^5 = 32):
LSP ← {(63, -34, 49, 47)}
LIP ← {(-31, 23), (10, 14, -13), (15, 14, -9, -7), (-1, -3, 2)}
LIS ← {(23), -34B, (15, -9, -7)}
Output: 5, 1+1-0011+000010000100101+0000
(Decoded image after this pass: the four significant coefficients are reconstructed as ±32, all other positions are still 0.)
  • 106. SPIHT – Set Partitioning In Hierarchical Trees
2) n = 4 (T = 2^4 = 16):
LSP ← {(63, -34, 49, 47), (-31, 23)}
LIP ← {(10, 14, -13), (15, 14, -9, -7), (-1, -3, 2)}
LIS ← {(23), -34B, (15, -9, -7)}
Output – sorting pass: 4, 1-1+000000000000000; refinement pass: 1010
(Decoded image after this pass shown on the slide.)
  • 107. SPIHT – Set Partitioning In Hierarchical Trees
3) n = 3 (T = 2^3 = 8):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9)}
LIP ← {(-7), (-1, -3, 2), (3), (-5, 3, 0), (2, -3, 5), (7, 3, 4), (7, 6, -1), (3, 3, 2)}
LIS ← {(-7), 23B, (14)}
Output – sorting pass: 3, 1+1+1-1+1+1-0000101-1-1+01101+0010001+0101+0011-0000101+00; refinement pass: 100110
(Decoded image after this pass shown on the slide.)
  • 108. SPIHT – Set Partitioning In Hierarchical Trees
4) n = 2 (T = 2^2 = 4):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9), (-7, -5, 5, 7, 4, 7, 6, 6, -4, 5, 6, 5, -7, 4, 4, 6, 4, 6, 6, -4, 4)}
LIP ← {(-1, -3, 2), (3), (3, 0), (2, -3), (3), (-1), (3, 3, 2), (-2), (3, -2), (-2, 2, 0), (3, 0, 3), (3)}
LIS ← {}
Output – sorting pass: 2, 1-00001-00001+1+01+1+1+000011+1-1+1+111+1-1+011+1+0010001+101+00101+1-1+; refinement pass: 10011101111011000110
(Decoded image after this pass shown on the slide.)
  • 109. SPIHT – Set Partitioning In Hierarchical Trees
5) n = 1 (T = 2^1 = 2):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9), (-7, -5, 5, 7, 4, 7, 6, 6, -4, 5, 6, 5, -7, 4, 4, 6, 4, 6, 6, -4, 4), (-3, 2, 3, 3, 2, -3, 3, 3, 3, 2, -2, 3, -2, -2, 2, 3, 3, 3)}
LIP ← {(-1), (0), (-1), (0), (0)}
LIS ← {}
Output – sorting pass: 1, 01-1+1+1+01+1-1+01+1+1+1-1+1-1-1+01+01+1+; refinement pass: 11011111011001001000100101110010100101100
(Decoded image after this pass shown on the slide.)
  • 110. SPIHT – Set Partitioning In Hierarchical Trees
6) n = 0 (T = 2^0 = 1):
LSP ← {(63, -34, 49, 47), (-31, 23), (10, 14, -13, 15, 14, -9, -12, -14, 8, 9, 11, 13, -12, 9), (-7, -5, 5, 7, 4, 7, 6, 6, -4, 5, 6, 5, -7, 4, 4, 6, 4, 6, 6, -4, 4), (-3, 2, 3, 3, 2, -3, 3, 3, 3, 2, -2, 3, -2, -2, 2, 3, 3, 3), (-1, -1)}
LIP ← {(0), (0), (0)}
LIS ← {}
Output – sorting pass: 0, 1-01-00; refinement pass: 10111100110100011101111101000101100000000101101111001000111
(The decoded image now matches the original coefficients exactly.)
  • 111. SPIHT – Set Partitioning In Hierarchical Trees
The basic algorithm operates on the three lists:
• LIP: List of Insignificant Pixels
• LIS: List of Insignificant Sets
• LSP: List of Significant Pixels
Initially, LIP holds all tree roots and LIS holds the D sets of the roots.
  • 115. What are the advantages and disadvantages of this partitioned approach?
+ The internal memory requirements are dramatically reduced.
+ The subimages can be transformed independently of each other.
+ The traditional 2D-DWT can be directly applied to the subimages without any modifications.
− It is not directly applicable to lossy compression: without boundary treatment, block artifacts appear (see below).
• 117. Boundary Treatment
Which pixels have to be considered in order to compute the low and high pass coefficients using the CDF(2,2) wavelet?
When targeting lossy compression, a partitioned wavelet transform without boundary treatment introduces block artifacts.
• 118. (figures: the computation of the low pass coefficients and their dependencies on the coefficients of the previous scales; the area of coefficients that has to be incorporated into the computation of the transform of the partition under consideration)
• 119. Modifications to the SPIHT Codec
In this section we present the modifications necessary to obtain an efficient hardware implementation of the SPIHT compressor based on the partitioned approach to the wavelet transform of images.
 At first, we exchange the sorting phase with the refinement phase to save memory for status information.
 The greatest challenge is the hardware implementation of the three lists LIP, LSP, and LIS.
• 120. Exchange of Sorting and Refinement Phase
In the basic SPIHT algorithm, status information has to be stored for the elements of LSP, specifying whether the corresponding coefficient has been added to LSP in the current iteration of the sorting phase (see Line 20). In the worst case all coefficients become significant in the same iteration. Consequently, we would have to provide a memory capacity of q^2 bits to store this information. However, if we exchange the sorting and the refinement phase, this information is no longer needed. The compressed data stream is still decodable and does not increase in size. Of course, we have to account for the reordering of the transmitted bits within an iteration.
• 121. Memory Requirements of the Ordered Lists
To obtain an efficient realization of the lists LIP, LSP, and LIS, we first have to specify the operations that take place on these lists and deduce worst case space requirements in a software implementation.
Estimations for LIS: we have to provide the following operations for LIS:
• initialize as empty list (Line 0),
• append an element (Lines 0 and 19),
• sequentially iterate over the elements (Line 5),
• delete the element under consideration (Lines 15 and 19),
• move the element under consideration to the end of the list and change its type (Line 14).
• 122. LIS contains at most (q/2)^2 elements, since only coordinates of the coarser q/2 × q/2 area can be roots of sets (cf. slide 147). To specify the coordinates of an element, 2 log2(q/2) bits are needed. A further bit per element is required to code the type information. This results in an overall space requirement for LIS of (q/2)^2 · (2 log2(q/2) + 1) bits.
• 123. Estimations for LIP and LSP
Now let us consider the lists LIP and LSP together, because they can be implemented jointly. Again, we start with the operations applied to both lists:
• initialize as empty list (Line 0),
• append an element (Lines 0, 4, 11, and 12),
• sequentially iterate over the elements (Line 20),
• delete the element under consideration from LIP (Line 4).
The overall space requirement for both lists is q^2 · 2 log2 q bits.
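As a quick sanity check of these worst case bounds (my arithmetic, assuming a subimage size of q = 16 as used by the internal coefficient memory of the lossy architecture later):

\[
\left(\tfrac{q}{2}\right)^2\!\left(2\log_2\tfrac{q}{2}+1\right) = 8^2 \cdot 7 = 448 \text{ bits (LIS)}, \qquad
q^2 \cdot 2\log_2 q = 16^2 \cdot 8 = 2048 \text{ bits (LIP and LSP)} .
\]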
• 126. Prototyping Environment
The prototyping environment provides several mechanisms to exchange data between the mounted Xilinx device, the local SRAM, and the PC main memory. These can be categorized into:
1. direct access to registers/latches (configured into the FPGA),
2. access to memory cells of the local SRAM (all data transfers go through the FPGA),
3. DMA transfers and DMA-on-demand transfers,
4. interrupts.
To communicate with the PCI card, one writes a simple C/C++ program which initializes the card, configures the FPGA, sets the clock rate, and starts the data transfer or the computation.
• 127. The Xilinx XC4085 XLA device
• The Xilinx XC4085 device consists of a matrix of Configurable Logic Blocks (CLBs) with 56 rows and 56 columns.
• These CLBs are SRAM based and can be configured many times; after the power supply is shut down, they have to be reconfigured.
• Each CLB can be configured to represent any function with 4 or 5 inputs (function generators F, G, or H).
• In addition, each CLB mainly provides two flip-flops, two tristate drivers, and the so-called carry logic.
• 128. The Xilinx XC4085 XLA device
The routing is done using programmable switch matrices (PSMs). (figures: (a) interconnections between the CLBs; (b) switch matrices in the XC4085XLA)
• Each CLB contains hard-wired carry logic to accelerate arithmetic operations.
• Fast adders can be constructed as simple ripple carry adders, using the special carry logic to calculate and propagate the carry between full adders in adjacent CLBs. One CLB is capable of realizing two full adders.
• 129. The Xilinx XC4085 XLA device
• Each CLB can be configured to provide internal RAM. In order to do this, conceive the four input signals to F and G as address lines.
• Thus a CLB provides either two 16×1 RAM modules or one 32×1 RAM module.
• At maximum we have about 12 Kbytes of RAM available.
• Each RAM block can be configured with different behavior:
• synchronous RAM: edge-triggered
• asynchronous RAM: level-sensitive
• single-port RAM
• dual-port RAM
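The 12 Kbyte figure follows directly from the CLB matrix (my arithmetic):

\[
56 \cdot 56 \ \text{CLBs} \times 32 \ \text{bits per CLB} = 100352 \ \text{bits} \approx 12.25 \ \text{Kbytes}.
\]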
• 130. 2D-DWT FPGA Architectures targeting Lossless Compression
In the following, all four available architectures are listed.
• one Lifting unit
– data transfer from the PC directly to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
• four Lifting units working in parallel
– data transfer from the PC to the internal memory and vice versa
– data transfer from the PC to the SRAM on the prototyping board and vice versa
• 131. The implementations with four Lifting units introduce parallelism with respect to the rows of a subimage. Once the FPGA is configured, a whole image can be transferred to the SRAM of the PCI card (maximum size 2 Mbytes). Each subimage is then loaded into the internal memory, wavelet transformed, and the result is stored in the SRAM, from where it can be transferred back to the PC. Another choice is to write a subimage directly into the internal memory of the FPGA, start the transform, and read the wavelet transformed subimage back to the PC. The computation itself is started when an internal status register is set by the software interface. The RAM itself is further decomposed into smaller units to support the parallel execution of the four Lifting units; it can store a subimage of size 32×32. There exist two separate finite state machines to control the wavelet transform and its inverse, respectively. (figure: the global data path diagram of a circuit with four Lifting units working in parallel)
• 132. 2D-DWT FPGA Architectures targeting Lossy Compression
We will distinguish between two different architectures for the partitioned discrete wavelet transform on images:
I. a 2D-DWT FPGA architecture based on the divide and conquer technique,
II. a pipelined 2D-DWT FPGA architecture.
• 133. 2D-DWT FPGA Architecture based on the Divide and Conquer Technique
In order to compute one level of the CDF(2,2) wavelet transform in one dimension, we need two pixels to the left and one pixel to the right of each row of an image. Thus, we extend each row by two memory cells on the left and one on the right; we call such an enlarged row an extended row. Let r be a row of an image of length 16. Now the problem is split into two subproblems which are solved interlocked. The first module computes the 'inner' coefficients c0,0, . . . , c0,15 and d0,0, . . . , d0,15, which only depend on the extended row. In order to proceed to the next level of the wavelet transform, the coefficients at the appended positions have to be brought up to date, i.e., the coefficients c0,−2, c0,−1, and c0,16 have to be computed. The first module, which is responsible for the inner coefficients, computes the subtotals of c0,−2, c0,−1, and c0,16 that only depend on the extended row. The second module computes the subtotals of c0,−2, c0,−1, and c0,16 that depend on pixels not in the row.
• 134. 2D-DWT FPGA Pipelined Architecture
• two one-dimensional DWT units (1D-DWT) for the horizontal and vertical transforms
• a control unit realized as a finite state machine
• an internal memory block
To process a subimage, all rows are transferred to the FPGA over the PCI bus and transformed on the fly in the horizontal 1D-DWT unit using pipelining. The coefficients computed in this way are stored in internal memory of different types; the coefficients corresponding to the rows of the subimage itself are stored in single port RAM. Then the vertical transform levels take place, carried out by the vertical 1D-DWT unit. The control unit coordinates these steps in order to process a whole subimage and is responsible for generating enable signals, address lines, and so on. At the end, the wavelet transformed subimage is available in the internal RAM. At this point an EZW algorithm can be applied to the multiscale representation of the subimage. Since all necessary boundary information was included in the computation, no block artifacts are introduced by the subsequent quantization.
• 135. 4-level Horizontal 1D-DWT unit
The whole horizontal transform is done for the 16 rows of the subimage under consideration. In addition, 30 rows of the neighboring subimage in the north and 15 rows of the southern subimage are transformed in the same manner; these additional computations are required by the vertical DWT applied next. The unit has to take four pixels of a row per clock cycle and must perform 4 levels of horizontal transforms. It consists of four pipelined stages, one for each transform level.
(figure: the first stage runs at F = f0, the second at F = f0/2; each stage splits its input into even and odd samples, alternately outputs a low or a high frequency coefficient per clock cycle, and feeds the input of the next stage)
• 136. Recall: Lifting Scheme for the CDF(2,2) wavelet after Integer-to-Integer Mapping
• 137. The w-bit input 1D-DWT unit (1i2o)
To implement one level of the DWT using the lifting method, the following steps are necessary:
• split the input into coefficients at odd and even positions,
• perform a predict step,
• perform an update step.
(The predict and update equations were given as figures; a reconstruction follows below.)
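For the integer-to-integer CDF(2,2) scheme these two steps are commonly written as d_l = x_{2l+1} − ⌊(x_{2l} + x_{2l+2})/2⌋ (predict) and c_l = x_{2l} + ⌊(d_{l−1} + d_l + 2)/4⌋ (update). The following C sketch is my reconstruction of this standard scheme, not code from the slides; the boundary handling uses the symmetric extension described on slide 156.

```c
#include <stddef.h>

/* Symmetric (whole-sample) extension of a length-n signal. */
static int ext(const int *x, long i, long n)
{
    if (i < 0)  i = -i;
    if (i >= n) i = 2 * (n - 1) - i;
    return x[i];
}

/* One level of the integer-to-integer CDF(2,2) lifting transform.
   x: n input samples (n even); c: n/2 low pass, d: n/2 high pass outputs.
   ">> 1" and ">> 2" realize the floor divisions (arithmetic shift assumed). */
void lifting_cdf22(const int *x, int *c, int *d, long n)
{
    long half = n / 2;
    /* predict: d[l] = x[2l+1] - floor((x[2l] + x[2l+2]) / 2) */
    for (long l = 0; l < half; l++)
        d[l] = ext(x, 2*l + 1, n)
             - ((ext(x, 2*l, n) + ext(x, 2*l + 2, n)) >> 1);
    /* update: c[l] = x[2l] + floor((d[l-1] + d[l] + 2) / 4) */
    for (long l = 0; l < half; l++) {
        int dm1 = (l > 0) ? d[l - 1] : d[0];  /* symmetric extension of d */
        c[l] = ext(x, 2*l, n) + ((dm1 + d[l] + 2) >> 2);
    }
}
```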
• 140. w-bit input 1D-DWT unit (according to the lifting scheme)
The unit consists of two register chains. The registers in the upper chain are enabled at even clock edges, the registers in the lower chain at odd clock edges. This splits the input into words at even and odd positions. Now the predict and update steps can be applied in a straightforward way.
• 141. The 4w-bit input 1D-DWT unit (4i4o)
This unit takes four pixels of the same row (i) at a time. We use it for both the first and the second level of the transform.
• 142. Internal Coefficient Memory
• The internal memory for the wavelet coefficients can store 16×16 coefficients.
• Since the bitwidth of the coefficients differs between the corresponding subbands, the memory block consists of 5 slices.
• Figure (a) shows once again the minimum required bitwidth for each subband; the structure of the internal memory is illustrated in Figure (b).
• 143. FPGA-Implementation of the Modified SPIHT Encoder
(figure: block diagram of the modified SPIHT compressor; the module 'coeff' stores the wavelet coefficients, LP := LIP ∪ LSP, and SL and SD store the pre-computed significance attributes for all thresholds)
• 144. (table: size in bits of each RAM block)
In comparison to the basic SPIHT algorithm, the modified SPIHT codec reduces the internal memory considerably (N = 512, d0 = 11); the exact before and after terms were given as formulas in the figure.
• 145. The overall functionality
• Each subimage of the wavelet transformed image is transferred once to the internal memory module named 'coeff' or is already stored there.
• At first, the initialization of the modules representing LIP, LSP, and LIS and the computation of the significances are done in parallel.
• The lists LIP and LSP are managed by the module 'LP', the bitmap of LIS by the module 'LIS'.
• The significances of sets are computed for all thresholds th ≤ kmax at once and are stored in the modules named 'SL' and 'SD', respectively. Here we distinguish between the significances for the sets L and D.
• With this information the compression can be started with bit plane kmax.
• Finite state machines control the overall procedure.
• The data to be output is registered in the module 'output', from which it is written to the local SRAM on the PCI card over a 32 bit wide data bus.
• Additionally, an arithmetic coder can be configured into that module, which further reduces the size of the compressed data stream.
  • 146. Hardware Implementation of the Lists • To reduce the memory requirement for the list data structures in the worst case, we implement the lists as bitmaps. • The bitmap of list L represents the characteristic function of list L.
• 147. • The RAM module which realizes LIP and LSP has a configuration of q × q entries of bit length 2, since for each pixel of the q × q subimage either (i, j) ∈ LIP, (i, j) ∈ LSP, or (i, j) ∉ LIP ∪ LSP holds. (figure: possible configuration states of a coordinate (i, j))
• The second RAM module implements LIS. Since none of the coefficients in the three finest subbands can be the root of a zerotree, a bitmap of size (q/2)^2 bits suffices. Type information has to be stored only for the area which corresponds to LL(1): coefficients there can be of type B, whereas all other coefficients are always of type A. This results in additional (q/4)^2 bits.
• 148. Efficient Computation of Significances
 Computing the significance of an individual coefficient is trivial: just select the kth bit of |ci,j| in order to obtain Sk(i, j). This can be realized by using bit masks and a multiplexer.
 Computing the significance of sets for all thresholds in parallel: we define S*(T) as the maximum threshold k for which some coefficient in T becomes significant (the formula was given as a figure; see the reconstruction after this slide).
 Once S*(T) is computed for all sets L and D, we have preprocessed the significances of sets for all thresholds. In order to do this, we use the two RAM modules SL and SD. (figure: organization of the RAM modules SL and SD)
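Written out, the definition reads (my reconstruction, consistent with the verbal description above):

\[
S^*(T) \;=\; \max\{\, k : S_k(T) = 1 \,\} \;=\; \left\lfloor \log_2 \max_{(i,j)\in T} |c_{i,j}| \right\rfloor ,
\]

so that S_k(T) = 1 holds exactly for all thresholds k ≤ S*(T).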
• 149. • The computation is done bottom up in the hierarchy defined by the spatial orientation trees.
• The entries of both RAMs are initialized with zero.
• Now let (e, f) be a coordinate with 0 < e, f < q just handled by the bottom up process, and let (i, j) = (⌊e/2⌋, ⌊f/2⌋) be the parent of (e, f) if it exists. Then SD and SL are updated by the following process. (figure: the update process, lines (6.1)-(6.3))
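The update rules (6.1)-(6.3) appeared only as a figure. Functionally, they propagate maxima of S* values up the trees: S*(D(i,j)) is the maximum of S* over the four children and S*(L(i,j)), and S*(L(i,j)) is the maximum of S*(D(child)) over the four children. A recursive C reference implementation of that computation (my sketch of the math, not of the hardware FSM, which does the same thing bottom up):

```c
#include <stdlib.h>

#define Q 16  /* subimage size; illustrative */

/* S* of a single coefficient: floor(log2 |c|), or -1 for c == 0. */
static int star(int c)
{
    int a = abs(c), k = -1;
    while (a) { a >>= 1; k++; }
    return k;
}

static int max2(int a, int b) { return a > b ? a : b; }

static int sd(const int c[Q][Q], int i, int j);

/* S*(L(i,j)): all descendants of (i, j) except the direct children. */
static int sl(const int c[Q][Q], int i, int j)
{
    int m = -1;
    for (int di = 0; di < 2; di++)
        for (int dj = 0; dj < 2; dj++)
            m = max2(m, sd(c, 2*i + di, 2*j + dj));
    return m;
}

/* S*(D(i,j)): the children (2i,2j), (2i,2j+1), (2i+1,2j), (2i+1,2j+1)
   plus all their descendants. */
static int sd(const int c[Q][Q], int i, int j)
{
    if (2*i + 1 >= Q || 2*j + 1 >= Q)
        return -1;  /* no children at the finest level */
    int m = -1;
    for (int di = 0; di < 2; di++)
        for (int dj = 0; dj < 2; dj++)
            m = max2(m, star(c[2*i + di][2*j + dj]));
    return max2(m, sl(c, i, j));
}
```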
• 150. • After reset we start the computation in state 1 and initialize k'max and the row and column indices e and f.
• At this time SD(e, f) and SL(e, f) hold their old values from the last subimage under consideration for all 0 < e, f < q.
• If the enable signal becomes active, we proceed to state 2. Here we buffer the present value of SD(e, f).
• In states 3, 4, 5, and 6 we compute line (6.1).
• The condition "e and f are odd" checks whether we visit a 2 × 2 coefficient block for the first time.
• States 2, 5, and 6 are responsible for computing the maximum of SD(i, j) and S*(e, f) (line (6.2)), which is buffered in tS.
• State 8 performs the assignment in line (6.3). Furthermore, this finite state machine updates the value k'max = kmax + 1 for the subimage under consideration.
• In state 10 the low frequency coefficient at position (0, 0) is included in this computation, too.
• The operation in state 9 is done using a simple subtractor and a combined representation with interleaved bit order of the row and column indices e and f, that is fn−1, en−1, fn−2, en−2, . . . , f1, e1, f0, e0.
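The interleaved bit order fn−1, en−1, . . . , f0, e0 is simply the Morton (Z-order) index of (e, f); a small illustrative helper in C:

```c
#include <stdint.h>

/* Morton/Z-order address: interleave the bits of the row index e and the
   column index f so the address reads f(n-1), e(n-1), ..., f0, e0. */
static uint32_t interleave(uint16_t e, uint16_t f, int n)
{
    uint32_t addr = 0;
    for (int b = 0; b < n; b++) {
        addr |= (uint32_t)((e >> b) & 1u) << (2 * b);      /* e bits: even positions */
        addr |= (uint32_t)((f >> b) & 1u) << (2 * b + 1);  /* f bits: odd positions  */
    }
    return addr;
}
```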
• 153. Hardware platform
The hardware platform used [WILDFORCE] is a PCI plug-in board with five Xilinx 4085 FPGAs, also referred to as PEs (Processing Elements). The board is equipped with five 1 MB SRAM chips, each directly connected to one of the five PEs. The embedded memory is accessible for read/write from both the host computer and the corresponding PE. Each 1 MB memory chip is organized as 262144 words of 32 bits.
• 154. Memory read/write
• The input image: 512 by 512 pixels.
• Input frames are loaded into the embedded memory by the host computer, and the results are read back once the PE has processed them.
• The PE also uses the embedded memory as intermediate storage to hold results between different stages of processing.
• Memory reads can be pipelined so that the effect of the read latency is minimized.
• 155. Design partitioning
The whole computation is partitioned into two stages, implemented on two separate FPGAs.
The first stage computes the discrete wavelet transform coefficients of the input image frame and writes them back to the embedded memory.
The second stage operates on this result to complete the rest of the processing: dynamic quantization, zero thresholding, run length encoding of zeroes, and entropy encoding of the coefficients.
• 156. Stage 1: Discrete Wavelet Transform
• A modified form of the biorthogonal (2,2) Cohen–Daubechies–Feauveau wavelet filter is used.
• The boundary conditions are handled by symmetric extension of the coefficients.
• The analysis and synthesis filter equations were shown as figures; a reconstruction of the standard form follows below.
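For reference, the unmodified CDF(2,2) filter bank in lifting form is commonly stated as follows (my reconstruction of the standard equations; the "modified form" on the slide may differ, e.g., in scaling). Analysis:

\[
d_l = x_{2l+1} - \tfrac{1}{2}\left(x_{2l} + x_{2l+2}\right), \qquad
c_l = x_{2l} + \tfrac{1}{4}\left(d_{l-1} + d_l\right),
\]

and synthesis, obtained by inverting the two steps in reverse order:

\[
x_{2l} = c_l - \tfrac{1}{4}\left(d_{l-1} + d_l\right), \qquad
x_{2l+1} = d_l + \tfrac{1}{2}\left(x_{2l} + x_{2l+2}\right).
\]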
• 157. (figures: DWT in the X and Y directions; coefficient ordering along the X direction; coefficient ordering along the Y direction)
  • 158. Each pixel in the input frame is represented by 16 bits, accounting for 2 pixels per memory word. Thus, each memory read brings in two consecutive pixels of a row.
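Unpacking such a pixel pair is a shift and a mask. A tiny illustrative helper (which half-word holds the first pixel is my assumption, not stated on the slides):

```c
#include <stdint.h>

/* Each 32-bit memory word holds two consecutive 16-bit pixels of a row. */
static void unpack_pixels(uint32_t word, uint16_t *p0, uint16_t *p1)
{
    *p0 = (uint16_t)(word & 0xFFFFu);  /* first pixel: low half-word (assumed)  */
    *p1 = (uint16_t)(word >> 16);      /* second pixel: high half-word (assumed) */
}
```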
• 159. (figures: the 3 stages of wave-letting; high pass and low pass coefficients at stage 1, X direction)
  • 161. Interleaved ordering along the 3 stages of wave-letting
• 162. • Memory addressing is done with a pair of address registers: a read address register and a write address register. (figure: Stage 1 architecture)
• 163. The difference between the write and read registers is the latency of the pipelined data-flow blocks. The maximum and minimum coefficient values for each block (each quadrant in the multi-stage wave-letting) are maintained on the FPGA. These values are written back to a known location in the lower half (lower 0.5 MB) of the embedded memory. The second stage uses these values for the dynamic quantization of the coefficients.
• 164. Stage 2: Dynamic quantization
• The coefficients from different sub-bands are quantized separately. The dynamic range of the coefficients for each sub-band (computed in the first stage) is divided into 16 quantization levels.
• The coefficients are quantized into one of the 16 possible levels.
• The maximum and minimum values of the coefficients for each sub-band are also needed when decoding the image.
(figure: the quantizer realized as a binary search tree lookup in hardware)
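Quantizing into one of 16 uniform levels over the sub-band's dynamic range amounts to a 4-step binary search, which maps directly onto a comparator tree in hardware. A software sketch of the same idea (my illustration, names and rounding details are assumptions):

```c
/* Quantize c into one of 16 uniform levels over [min, max], the dynamic
   range of its sub-band, by a 4-step binary search: one level bit per step. */
static unsigned quantize16(int c, int min, int max)
{
    unsigned level = 0;
    int lo = min, hi = max;
    for (int bit = 3; bit >= 0; bit--) {
        int mid = lo + (hi - lo) / 2;           /* halve the interval   */
        if (c >= mid) { level |= 1u << bit; lo = mid; }
        else          { hi = mid; }
    }
    return level;                               /* 0 .. 15 */
}
```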
• 165. Stage 2: Zero thresholding and RLE on zeroes
Different thresholds are used for different sub-bands, resulting in different resolutions in different sub-bands.
• 167. Stage 2: Entropy encoding
• The encoding is implemented by two look-up tables on the FPGA. Given an eight bit input, the first look-up table (LUT) provides the size of the encoding; the second LUT gives the actual encoding.
• Only the relevant bits from the second LUT should be used.
• The rest of the bits in the output are don't-cares and are chosen as logic 0 or 1 during logic optimization.
(figure: Stage 2 entropy encoder)
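The two-LUT organization separates the code length from the code bits, so one 8-bit symbol indexes both tables in a single cycle. A minimal software analogue (the table contents below are placeholders, not the actual codebook):

```c
#include <stdint.h>

/* len_lut gives the length of each symbol's code (3..18 bits),
   code_lut the right-aligned code bits; contents are placeholders. */
static const uint8_t  len_lut[256]  = { 3, 4, 5 /* ... */ };
static const uint32_t code_lut[256] = { 0x5, 0xA, 0x13 /* ... */ };

static uint32_t encode(uint8_t sym, unsigned *nbits)
{
    *nbits = len_lut[sym];                         /* first LUT: size of code */
    return code_lut[sym] & ((1u << *nbits) - 1u);  /* second LUT: code bits   */
}
```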
• 168. Stage 2: Entropy encoding, bit packing
The output of the entropy encoder varies from 3 to 18 bits. These bits need to be packed into 32 bit words before being written back to the embedded memory. This is achieved by a shifter inspired by the Xtetris computer game and the binary search algorithm. The shifter consists of 5 register stages, each 32 bits wide. The input data can be shifted (rotated) by 16 bits or latched without shifting into stage 1. The data can be shifted by 8 or passed on straight from stage 1 to stage 2. Similarly, data can be shifted by 4, 2, and 1 when moving between the remaining stages. Data is shifted from stage to stage and is accumulated at the last stage. When the last stage has 32 bits of data, a memory write is initiated and the last stage is flushed.
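In software the five shift stages collapse into a single bit accumulator that appends a variable-length code and emits a 32-bit word whenever one is full. A sketch of that behavior (my formulation; the staged hardware shifter achieves the same net effect):

```c
#include <stdint.h>

/* Bit packer: append codes of 3..18 bits, flush full 32-bit words. */
typedef struct {
    uint64_t buf;                 /* bit accumulator                  */
    unsigned fill;                /* number of valid bits in buf      */
    void (*emit)(uint32_t word);  /* memory-write callback            */
} packer;

static void pack(packer *p, uint32_t code, unsigned nbits)
{
    p->buf  = (p->buf << nbits) | (code & ((1u << nbits) - 1u));
    p->fill += nbits;
    while (p->fill >= 32) {       /* a full word accumulated: write it */
        p->emit((uint32_t)(p->buf >> (p->fill - 32)));
        p->fill -= 32;
    }
}
```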
• 170. Stage 2: Output file format
At the end of the second stage, the upper memory (upper 0.5 MB) contains the packed bit stream. The total count of the bit stream, rounded to the nearest word, is written to memory location 0. To reconstruct the data from the bit stream, the following information is needed:
 The actual bit stream. On Huffman decoding, the actual 8 bit codes are retrieved. These codes are either the quantizer output or the RLE count. On expanding the RLE count to the corresponding number of zeroes, we get the actual quantized stream.
 The four quadrants of the final stage of wave-letting can be located at the first four 128×128 byte blocks. The three quadrants of the next stage can be located at the next three blocks, sized 256×256 bytes each. Each quadrant (sub-band) is quantized separately; the dynamic range of each quadrant must be known to reconstruct the original stream.
 The output file contains all the information needed to reconstruct the image.
(figure: Stage 2 output file format)
• 171. (figures: Stage 2 data flow diagram; overall architecture)
• 172. Wavelet coefficients are read from the lower half of the embedded memory; the block (sub-band) minimum and maximum are also read from memory. The packed bit stream output is written to the upper memory, and the bit stream length is written to memory location 0. The control software reads the embedded memory and generates the compressed image file.
Before reading the wavelet coefficients, the maximum and minimum of the coefficients in each sub-band are read from the lower memory. The coefficients are then read and processed sub-band by sub-band, starting with the lowest frequency band. As shown in the state diagram, a memory read is fired in state Read 001. A memory read has a latency of 2 clock cycles; the result is finally available in state Read 100. Memory writes complete in the same cycle. The two intermediate states, Read 010 and Write, can be used to write back the output, if output is available. Each memory read brings in two wavelet coefficients. In the worst case, the two coefficients expand to 18 bits each, so there are two memory write cycles before the next read. Whenever a memory write is performed, the memory address register is incremented. The read address generators read each sub-band from the interleaved memory pattern. The output is written as a continuous stream, starting with the lowest sub-band. Thus the output is effectively in Mallat ordering and can be progressively transmitted and decoded.
  • 173. Stage 2, control flow diagram