SlideShare a Scribd company logo
1 of 6
Download to read offline
www.ijmer.com

International Journal of Modern Engineering Research (IJMER)
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169
ISSN: 2249-6645

Efficient Implementation of Low Power 2-D DCT Architecture
1

Kalyan Chakravarthy. K , 2G.V.K.S.Prasad

1

2

M.Tech student, ECE, AKRG College of Engineering and Technology, Nallajerla, INDIA
Assistant professor, ECE, AKRG College of Engineering and Technology, Nallajerla, INDIA

ABSTRACT: This Research paper includes designing a area efficient and low error Discrete Cosine Transform. This area
efficient and low error DCT is obtained by using shifters and adders in place of multipliers. The main technique used here is
CSD (Canonical Sign Digit) technique.CSD technique efficiently reduces redundant bits. Pipelining technique is also
introduced here which reduces the processing time.

I.

INTRODUCTION

Multimedia data processing, which encompasses almost every aspects of our daily life such as communication broad casting,
data search, advertisement, video games, etc has become an integral part of our life style. The most significant part of
multimedia systems is application involving image or video, which require computationally intensive data processing.
Moreover, as the use of mobile device increases exponentially, there is a growing demand for multimedia application to run
on these portable devices. A typical image/video transmission system is shown in the Figure 1.

Figure1. Image/Video Transmission System
In order to reduce the volume of multimedia data over wireless channel compression techniques are widely used.
Efficiency of a transformation scheme can be directly gauged by its ability to pack input data into as few coefficients as
possible. This allows the quantizer to discard coefficients with relatively small amplitudes without introducing visual
distortion in the reconstructed image.
In image compression, the image data is divided up into 8x8 blocks of pixels. (From this point on, each color
component is processed independently, so a "pixel" means a single value, even in a color image.) A DCT is applied to each
8x8 block. DCT converts the spatial image representation into a frequency map: the low-order or "DC" term represents the
average value in the block, while successive higher-order ("AC") terms represent the strength of more and more rapid
changes across the width or height of the block. The highest AC term represents the strength of a cosine wave alternating
from maximum to minimum at adjacent pixels.

II.

JPEG ENCODER

The JPEG encoder is a major component in JPEG standard which is used in image compression. It involves a
complex sub-block Discrete Cosine Transform (DCT), along with other quantization, zigzag and Entropy coding blocks. In
this architecture, 2-D DCT is computed by combining two 1-D DCT that connected by a transpose buffer. The architecture
uses 4059 slices, 6885 LUT, 58 I/Os of Xilinx Spartan-3 XC3S1500 FPGA and works at an operating frequency of 65.55
MHz.
Vector processing using parallel multipliers is a method used for implementation of DCT. The advantages in the
vector processing method are regular structure, simple control and interconnect, and good balance between performance and
complexity of implementation. For the case of 8 x 8 block region, a onedimensional 8- point DCT followed by an internal transpose memory, followed by another one dimensional 8-point DCT
provides the 2D DCT architecture. Here the real time DCT coefficients are used for matrix multiplication. Using vector
processing, the output Y of an 8 x 8 DCT for input X is given by Y = C*X* CT, where C is the cosine coefficients and CT are
the transpose coefficients. This equation can also be written as Y=C * Z, where Z = X*C T.
The calculation is implemented by using eight multipliers and storing the co-efficients in ROMs. At the first clock, the eight
inputs xOO to x07 are multiplied by the eight values in column one, resulting in eight products (POO _0 to POO 7). At the
eighth clock, the eight inputs are multiplied by the eight values in column eight resulting in eight 586 products (P07 0 to P07
7). From the equations for Z, the intermediate values for the first row of Z are computed. The values for ZO (0 -7) can be
www.ijmer.com

3164 | Page
International Journal of Modern Engineering Research (IJMER)
www.ijmer.com
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169
ISSN: 2249-6645
calculated in eight clock cycles. All 64 values of Z are calculated in 64 clock cycles and then the process is repeated. The
values of Z correspond to the 1-DDCT of the input X. Once the Z values are calculated, the 2D-DCT function Y can be
calculated from Y = C*Z.

Figure 2. 2-D DCT Architecture
The maximum clock frequency is 65. 55 MHz when implemented with a Spartan FPGA XCS1500 device.

III.

MULTIPLIER-LESS 2-D DCT/IDCT ARCHITECTURE FOR JPEG

The Discrete Cosine transform is widely used as the core of digital image compression. Discrete Cosine Transform
attempts to de-correlate the image data. After de-correlation, each transform co-efficient can be encoded independently
without losing compression efficiency. In this paper we present VLSI Implementation of fully pipelined multiplier less
architecture of 8x8 2D DCT/IDCT. This architecture is used as the core of JPEG compression hardware. The 2-D DCT
calculation is made using the 2-D DCT separability property, such that the whole architecture is divided into two 1-D DCT
calculations by using a transpose buffer. The architecture described to implement 1-D DCT is based on Binary-Lifted DCT.
The 2-D DCT architecture achieves an operating frequency of 166 MHz. One input block of 8x8 samples each of 8 bits each
is processed in 0.198μs. The pipeline latency of proposed architecture is 45 Clock cycles.

Figure 3 Architecture of 2-D DCT
Figure 2. shows the architecture of 2-D DCT. 2D-DCT/IDCT design is divided into three major blocks namely
Row-DCT, Transpose Buffer, and Column-DCT. Row-DCT and Column-DCT contains both 1DDCT (Figure 4) by Row .
Access of Transpose buffer DCT and Column DCT is efficiently managed to achieve the performance and minimal usage of
resources in terms of area.

Figure 4. Architecture of 1-D DCT
During Forward transform, 1D-DCT structure (Figure 4) is functionally active. Row-DCT block receives two 8-bit
samples as an input in every cycle. Each sample is a signed 8- bit value and hence its value ranges from -128 to 127. The bit
width of the transformed sample is maintained as 10-bit to accommodate 2-bit increment.

Figure 5. Four stage pipeline 1D-DCT.
1D-DCT computation architecture (Figure 4 ) has a four stage internal pipeline shown in
Figure 5.Transpose Buffer receives two 10-bit samples as an input every cycle. Each sample is a signed 10-bit value
and hence its value ranges from -512 to 511. Since there is no data Manipulation in the module the output sample width
remains as input sample width i.e. 10-bit. Transpose buffer has sixty-four 10-bit registers to store one 8X8 block 1D-DCT
www.ijmer.com

3165 | Page
International Journal of Modern Engineering Research (IJMER)
www.ijmer.com
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169
ISSN: 2249-6645
samples. Transpose operation on 8X8 block data is performed by writing the transformed samples in row-wise and reading
them in column-wise and vice versa. Transpose Buffer block guarantees that the nth 8X8 block data will be written into the
registers after (n-1) th 8X8 block data has been completely read out for further processing. The latency of the block is 31
cycles since the data can be read only after full 8X8 block is written in the registers. Column DCT block receives two 10-bit
samples as an input in every cycle. Input samples are internally buffered.
The 2D-DCT/IDCT architecture efficiently operates up to 166MHz. Pipeline latency for the initial 8x8 block with each
element of 8 bits is 45 clock cycles which is due to 7 cycles at Row-DCT, 31 cycles for Row-DCT operation to complete, 7
cycles at Column-DCT. Effectively to perform complete 2D DCT on one 8x8 will take 33 Clock cycles on availability of
continuous input data to process. For operating frequency of 166 MHz, the processing time of 8x8 blocks is 0.198μs.
IV. QUANTIZATION ARCHITECTURE
Two dimensional DCT takes important role in JPEG image compression. Architecture and VHDL design of 2-D
DCT, combined with quantization and zig-zag arrangement, is described. The architecture is used in JPEG image
compression. The output of DCT module needs to be multiplied with post-scaler value to get the real DCT coefficients. Postscaling process is done together with quantization process. 2-D DCT is computed by combining two 1-D DCT that
connected by a transpose buffer. This design aimed to be implemented in cheap Spartan 3E XC3S500 FPGA. The 2-D DCT
architecture uses 3174 gates, 1145 Slices and 11 multipliers of one Xilinx Spartan-3E XC3S500E FPGA and reaches an
operating frequency of 84.81 MHz. One input block with
8 x 8 elements of 8 bits each is processed in 2470 ns and pipeline latency is 123 clock cycles.
This architecture adopts the scaled 1-D DCT algorithm. It means the DCT coefficients produced by the algorithm
are not the real coefficients. To get the real coefficients, the scaled ones must be multiplied with post-scaler value. Equation
(1) is showing scaled 1-D DCT process.
(1)
y' = C x
Variable x is 8 point vector. C is Arai's DCT matrix and y' is vector of scaled DCT coefficients. To get the real DCT
coefficients, y' must be element by element multiplicated with post-scaling factor. It is shown in (2).
(2)
y = s .* y'
Constant s is vector of post-scaling factor. Figure 6 shows the block diagram representation of the entire system.

Figure 6. Block Diagram of entire system
The DCT input/output has 8 points and data has to be entered and released in sequential manner, it takes 8 clock
cycles for each input and output process. Totally, 8 points 1D-DCT computation needs 22 clock cycles. 1D-DCT
computation is done in pipeline process. Figure 7 shows 1-D DCT architecture with pipelining.

Figure 7. 1-D DCT with pipelining
The 2-D DCT architecture was described in VHDL. This VHDL was synthesized into an Xilinx Spartan 3E family
FPGA. According to synthesis result, maximum time delay produced is 5.895 ns. That constraint yields minimum clock
period 11.790 ns. Maximum clock frequency can be used is 84.818 MHz. The 2-D DCT architecture uses 3174 gates, 1145
Slices and 11 multipliers.
The stronger operator, multiplication is transformed to simple shift and adds operations by applying
Horner’s rule. This reduces the power consumption. For example, consider the cosine coefficients c and g, c *X
www.ijmer.com

3166 | Page
International Journal of Modern Engineering Research (IJMER)
www.ijmer.com
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169
ISSN: 2249-6645
= 25 + 24 + 22 +1 (X) = (24 (3) + 5) (X) and g*X = 23 + 22 (X)= 22(3) (X) and the common terms they share is 3X. The
common terms among the cosine basis are the partial outputs. These blocks are termed as precomputing units and an unit is
shown in the Figure 3. The intermediate results from the precomputing blocks are added in the final stage yielding the DCT
coefficients. The 3A is constructed by expressing it as 3A = 1A+2A = {1A + (1A<<1)}. Similarly the 5A can be expressed
as {1A + (1A<<2)}.
A.

Low Power Odd And Even Architecture
The stages involved in the determination of 1-D DCT coefficients are precomputing, odd DCT and even DCT. The
proposed architecture is a multiplier less architecture. The representation of cosine basis in CSD format reduces power
consumption as it has less number of ones. Each multiplication of the cosine coefficients with the sequence are expressed
using Horner’s rule [6].
The sub structure sharing is carefully done so that it contains the actual sequence i.e. 1A prevalently. This keeps the
result of the shift and adds nearly matching the direct multiplication of the cosine basis with the sequence. The Odd DCT
precomputing units have 1A, -1A, 3A, and 5A structures to be shared. 3A can be expressed as (1A+2 1A) and 21A is obtained
by left shifting A by 1 place to the left. Similarly 5A=1A+(A<<2).
The even DCT contains the DC part of the transformed image i.e. in frequency domain. Hence care should be taken
in calculating the Z0 value. The corresponding cosine basis is d and is represented as d = 2 5 + 23 + 22 +1. The precomputing
unit of the even DCT consists of 1A, -1A and 3A only.
The odd and even modules use four precomputing instances each. The outputs from the precomputing units are fed
to the final adder stage. The final adder outputs are the desired DCT coefficients. The image pixel value is represented in 8bits and hence the output of the adder and subtractor has 9-bits.The input to the row DCT has 9-bits and its output is chosen
to have 12 bits. Thus the column DCT is fed by 12-bit inputs. The DCT coefficients are quantized to have 12-bits.
B. Synthesis Results
The 2-D DCT architecture is implemented with Verilog HDL and simulated using MODELSIM for functional verification.
Finally it is synthesized using QUARTUS II tool and mapped on to a Cyclone III device (65 nm). The simulated results are
shown below. The DCT coefficients obtained through this architecture is verified against MATLAB results.

A low complexity 2-D DCT architecture has been designed. The sub structure sharing removes the redundancy in
the DCT calculation. The architecture can be pipelined to meet high performance needs such as applications in HD
televisions, satellite imaging systems.
V.
IMPLEMENTATION OF THE PROPOSED APPROACH
Raw image data used in applications such as high definition television, video conferencing, computer
communication, etc. require large storage and high speed channels for handling huge volumes of image data. In order to
reduce the storage and communication channel bandwidth requirements to manageable levels, data compression techniques
are inevitable. To meet the timing requirements for high speed applications the compression has to be achieved at high
speeds.
www.ijmer.com

3167 | Page
International Journal of Modern Engineering Research (IJMER)
www.ijmer.com
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169
ISSN: 2249-6645
A. Pipelining
The speed of the system depends on the time taken by the longest data path which is known as the critical path. To
increase the speed of the architecture, pipelining technique can be used. Pipelining is the technique of introducing latches
along the data path so as to reduce the critical path. Reduction in the critical path increases the speed of operation. Pipelining
latches can only be placed across any feed-forward cutest of the architecture graph. The clock period is then limited and the
critical path may be between i.An input and a latch, ii.A latch and an output, iii. Two Latches an input and an output.
In this architecture, the pipelining is of fine grain type as the pipeline latches are introduced inside the 1-D module.
The critical path of the 1-D module is from the input to the output. This includes the computational time of precomputing
modules and the final adders as well. The critical path time is from the onset of the input xi+xj and the arrival of zk
(i=0,1,,2,3;j=4,5,,6,7 and k=0 to 7). The speed can be improved if latches are introduced between the shifter and adder
module of precomputing modules. After pipelining the critical path is from the input(X) to 3X generator.
The architecture is synthesized into Stratix family EP1S10 device (for high performance needs). The results are shown in the
Table1

Table 1. Frequency summary
B. Device Utilization Summary (Non Pipelined 2-D DCT)

C. Image Analysis Results

Figure 8. Original Image

Figure 9. Reconstructed From DCT Coefficients

www.ijmer.com

3168 | Page
www.ijmer.com

International Journal of Modern Engineering Research (IJMER)
Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169
ISSN: 2249-6645

VI.

CONCLUSION

The 2-D DCT and 1D-DCT architectures which adopts algorithmic strength reduction technique to reduce the
device utilization pulling the power consumption low have thus been designed. The DCT computation is performed with
sufficiently high precision yielding an acceptable quality. The pipelined 2-D DCT architecture achieves a maximum
operating frequency of 185.95 MHz. The first eight 2-D coefficients arrived at the nineteenth clock cycle and for the full 64
coefficients, it took about 26 clock cycles to compute. The future work can be oriented towards developing an encoder by
architecting a quantizer, based on the strength reduction technique and an entropy encoder.

REFERENCES
[1].
[2].
[3].
[4].
[5].

[6].
[7].

Enas Dhuhri Kusuma, Thomas Sri Widodo “FPGA Implementation of Pipelined 2D- DCT and Quantization
Architecture for JPEG Image Compression” Proceedings of 978-1-4244-6716-7/10/$26.00 ©2010 IEEE
Vijay Kumar Sharma, K. K. Mahapatra and Umesh C. Pati “An Efficient Distributed Arithmetic based VLSI
Architecture for DCT”
Chidanandan, Bayoumi, M, “Area-Efficient NEDA Architecture for The 1-D DCT/IDCT” ICASSP 2006 Proceedings.
2006 IEEE International Conference.
Zhenyu Liu Tughrul Arslan Ahmet T. Erdogan “A Novel Reconfigurable Low Power Distributed Arithmetic
Architecture for Multimedia Applications” Proceedings of 1-4244-0630-7/07/$20.00 C 2007 IEEE.
S.Saravanan, Dr.Vidyacharan Bhaskar, P.T.Lee ,“A High Performance Parallel Distributed Arithmetic DCT
Architecture for H.264 Video Compression” European Journal of Scientific Research ISSN 1450- 216X Vol.42 No.4
(2010), pp.558-564 EuroJournals Publishing, Inc. 2010
VLSI Digital Signal Processing Systems: Design And Implementation (Hardcover) By Keshab K Parhi
Vijaya Prakash.A.M, K.S.Gurumurthy “A Novel VLSI Architecture for Digital Image Compression Using Discrete
Cosine Transform and Quantization”, IJCSNS International Journal of Computer Science and Network Security,
VOL.10 No.9, September 2010
Santosh Sanjeevannanavar, Nagamani A.N , “Efficient Design and FPGA Implementation of JPEG Encoder using
Verilog HDL” 978-1-4673-0074- 2/11/$26.00 @2011 IEEE 584

www.ijmer.com

3169 | Page

More Related Content

What's hot

Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd Iaetsd
 
Efficient Design of Reversible Multiplexers with Low Quantum Cost
Efficient Design of Reversible Multiplexers with Low Quantum CostEfficient Design of Reversible Multiplexers with Low Quantum Cost
Efficient Design of Reversible Multiplexers with Low Quantum CostIJERA Editor
 
Fpga based low power and high performance address
Fpga based low power and high performance addressFpga based low power and high performance address
Fpga based low power and high performance addresseSAT Publishing House
 
Fpga based low power and high performance address generator for wimax deinter...
Fpga based low power and high performance address generator for wimax deinter...Fpga based low power and high performance address generator for wimax deinter...
Fpga based low power and high performance address generator for wimax deinter...eSAT Journals
 
IRJET-ASIC Implementation for SOBEL Accelerator
IRJET-ASIC Implementation for SOBEL AcceleratorIRJET-ASIC Implementation for SOBEL Accelerator
IRJET-ASIC Implementation for SOBEL AcceleratorIRJET Journal
 
Iaetsd finger print recognition by cordic algorithm and pipelined fft
Iaetsd finger print recognition by cordic algorithm and pipelined fftIaetsd finger print recognition by cordic algorithm and pipelined fft
Iaetsd finger print recognition by cordic algorithm and pipelined fftIaetsd Iaetsd
 
A Novel Design of 4 Bit Johnson Counter Using Reversible Logic Gates
A Novel Design of 4 Bit Johnson Counter Using Reversible Logic GatesA Novel Design of 4 Bit Johnson Counter Using Reversible Logic Gates
A Novel Design of 4 Bit Johnson Counter Using Reversible Logic Gatesijsrd.com
 
Iaetsd low power flip flops for vlsi applications
Iaetsd low power flip flops for vlsi applicationsIaetsd low power flip flops for vlsi applications
Iaetsd low power flip flops for vlsi applicationsIaetsd Iaetsd
 
34 8951 suseela g suseela paper8 (edit)new
34 8951 suseela g   suseela paper8 (edit)new34 8951 suseela g   suseela paper8 (edit)new
34 8951 suseela g suseela paper8 (edit)newIAESIJEECS
 
33 8951 suseela g suseela paper8 (edit)new2
33 8951 suseela g   suseela paper8 (edit)new233 8951 suseela g   suseela paper8 (edit)new2
33 8951 suseela g suseela paper8 (edit)new2IAESIJEECS
 
A Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI Architecture
A Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI ArchitectureA Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI Architecture
A Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI ArchitectureIRJET Journal
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 

What's hot (16)

Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
 
Efficient Design of Reversible Multiplexers with Low Quantum Cost
Efficient Design of Reversible Multiplexers with Low Quantum CostEfficient Design of Reversible Multiplexers with Low Quantum Cost
Efficient Design of Reversible Multiplexers with Low Quantum Cost
 
Fpga based low power and high performance address
Fpga based low power and high performance addressFpga based low power and high performance address
Fpga based low power and high performance address
 
Fpga based low power and high performance address generator for wimax deinter...
Fpga based low power and high performance address generator for wimax deinter...Fpga based low power and high performance address generator for wimax deinter...
Fpga based low power and high performance address generator for wimax deinter...
 
IRJET-ASIC Implementation for SOBEL Accelerator
IRJET-ASIC Implementation for SOBEL AcceleratorIRJET-ASIC Implementation for SOBEL Accelerator
IRJET-ASIC Implementation for SOBEL Accelerator
 
Iaetsd finger print recognition by cordic algorithm and pipelined fft
Iaetsd finger print recognition by cordic algorithm and pipelined fftIaetsd finger print recognition by cordic algorithm and pipelined fft
Iaetsd finger print recognition by cordic algorithm and pipelined fft
 
A Novel Design of 4 Bit Johnson Counter Using Reversible Logic Gates
A Novel Design of 4 Bit Johnson Counter Using Reversible Logic GatesA Novel Design of 4 Bit Johnson Counter Using Reversible Logic Gates
A Novel Design of 4 Bit Johnson Counter Using Reversible Logic Gates
 
Iaetsd low power flip flops for vlsi applications
Iaetsd low power flip flops for vlsi applicationsIaetsd low power flip flops for vlsi applications
Iaetsd low power flip flops for vlsi applications
 
34 8951 suseela g suseela paper8 (edit)new
34 8951 suseela g   suseela paper8 (edit)new34 8951 suseela g   suseela paper8 (edit)new
34 8951 suseela g suseela paper8 (edit)new
 
33 8951 suseela g suseela paper8 (edit)new2
33 8951 suseela g   suseela paper8 (edit)new233 8951 suseela g   suseela paper8 (edit)new2
33 8951 suseela g suseela paper8 (edit)new2
 
A Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI Architecture
A Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI ArchitectureA Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI Architecture
A Configurable and Low Power Hard-Decision Viterbi Decoder in VLSI Architecture
 
Aw4102359364
Aw4102359364Aw4102359364
Aw4102359364
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Gn3311521155
Gn3311521155Gn3311521155
Gn3311521155
 
Fx3111501156
Fx3111501156Fx3111501156
Fx3111501156
 
Hz2514321439
Hz2514321439Hz2514321439
Hz2514321439
 

Viewers also liked

Discrete cosine transform
Discrete cosine transform   Discrete cosine transform
Discrete cosine transform Rashmi Karkra
 
Discrete cosine transform
Discrete cosine transformDiscrete cosine transform
Discrete cosine transformaniruddh Tyagi
 
C0502 01 1116
C0502 01 1116C0502 01 1116
C0502 01 1116IJMER
 
International Journal of Modern Engineering Research (IJMER)
International Journal of Modern Engineering Research (IJMER)International Journal of Modern Engineering Research (IJMER)
International Journal of Modern Engineering Research (IJMER)IJMER
 
How To Talk To Anyone - Book Review
How To Talk To Anyone - Book ReviewHow To Talk To Anyone - Book Review
How To Talk To Anyone - Book ReviewIES MCRC
 
A Total Ergonomics Model for integration of health, safety and work to improv...
A Total Ergonomics Model for integration of health, safety and work to improv...A Total Ergonomics Model for integration of health, safety and work to improv...
A Total Ergonomics Model for integration of health, safety and work to improv...IJMER
 
Trough External Service Management Improve Quality & Productivity
Trough External Service Management Improve Quality & ProductivityTrough External Service Management Improve Quality & Productivity
Trough External Service Management Improve Quality & ProductivityIJMER
 
Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...
Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...
Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...IJMER
 
Ar32704708
Ar32704708Ar32704708
Ar32704708IJMER
 
Implementation of Monitoring System for Cloud Computing
Implementation of Monitoring System for Cloud ComputingImplementation of Monitoring System for Cloud Computing
Implementation of Monitoring System for Cloud ComputingIJMER
 
The Queue M/M/1 with Additional Servers for a Longer Queue
The Queue M/M/1 with Additional Servers for a Longer QueueThe Queue M/M/1 with Additional Servers for a Longer Queue
The Queue M/M/1 with Additional Servers for a Longer QueueIJMER
 
Optimization of Tool Wear: A Review
Optimization of Tool Wear: A ReviewOptimization of Tool Wear: A Review
Optimization of Tool Wear: A ReviewIJMER
 
A Review of FDM Based Parts to Act as Rapid Tooling
A Review of FDM Based Parts to Act as Rapid ToolingA Review of FDM Based Parts to Act as Rapid Tooling
A Review of FDM Based Parts to Act as Rapid ToolingIJMER
 
Develop with love bb10
Develop with love bb10Develop with love bb10
Develop with love bb10Bhasker Thapan
 
Secure and Efficient Hierarchical Data Aggregation in Wireless Sensor Networks
Secure and Efficient Hierarchical Data Aggregation in Wireless Sensor NetworksSecure and Efficient Hierarchical Data Aggregation in Wireless Sensor Networks
Secure and Efficient Hierarchical Data Aggregation in Wireless Sensor NetworksIJMER
 
An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...IJMER
 
Intranetizen IIC12: how to ensure project faillure
Intranetizen IIC12: how to ensure project faillureIntranetizen IIC12: how to ensure project faillure
Intranetizen IIC12: how to ensure project faillureIntranetizen
 
FPGA Based Wireless Jamming Networks
FPGA Based Wireless Jamming NetworksFPGA Based Wireless Jamming Networks
FPGA Based Wireless Jamming NetworksIJMER
 
Comparative study on Garments dyeing process and Fabric dyeing process on var...
Comparative study on Garments dyeing process and Fabric dyeing process on var...Comparative study on Garments dyeing process and Fabric dyeing process on var...
Comparative study on Garments dyeing process and Fabric dyeing process on var...IJMER
 

Viewers also liked (20)

Compression
CompressionCompression
Compression
 
Discrete cosine transform
Discrete cosine transform   Discrete cosine transform
Discrete cosine transform
 
Discrete cosine transform
Discrete cosine transformDiscrete cosine transform
Discrete cosine transform
 
C0502 01 1116
C0502 01 1116C0502 01 1116
C0502 01 1116
 
International Journal of Modern Engineering Research (IJMER)
International Journal of Modern Engineering Research (IJMER)International Journal of Modern Engineering Research (IJMER)
International Journal of Modern Engineering Research (IJMER)
 
How To Talk To Anyone - Book Review
How To Talk To Anyone - Book ReviewHow To Talk To Anyone - Book Review
How To Talk To Anyone - Book Review
 
A Total Ergonomics Model for integration of health, safety and work to improv...
A Total Ergonomics Model for integration of health, safety and work to improv...A Total Ergonomics Model for integration of health, safety and work to improv...
A Total Ergonomics Model for integration of health, safety and work to improv...
 
Trough External Service Management Improve Quality & Productivity
Trough External Service Management Improve Quality & ProductivityTrough External Service Management Improve Quality & Productivity
Trough External Service Management Improve Quality & Productivity
 
Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...
Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...
Scope of Improving Energy Utilization in Coal Based Co-Generation on Thermal ...
 
Ar32704708
Ar32704708Ar32704708
Ar32704708
 
Implementation of Monitoring System for Cloud Computing
Implementation of Monitoring System for Cloud ComputingImplementation of Monitoring System for Cloud Computing
Implementation of Monitoring System for Cloud Computing
 
The Queue M/M/1 with Additional Servers for a Longer Queue
The Queue M/M/1 with Additional Servers for a Longer QueueThe Queue M/M/1 with Additional Servers for a Longer Queue
The Queue M/M/1 with Additional Servers for a Longer Queue
 
Optimization of Tool Wear: A Review
Optimization of Tool Wear: A ReviewOptimization of Tool Wear: A Review
Optimization of Tool Wear: A Review
 
A Review of FDM Based Parts to Act as Rapid Tooling
A Review of FDM Based Parts to Act as Rapid ToolingA Review of FDM Based Parts to Act as Rapid Tooling
A Review of FDM Based Parts to Act as Rapid Tooling
 
Develop with love bb10
Develop with love bb10Develop with love bb10
Develop with love bb10
 
Secure and Efficient Hierarchical Data Aggregation in Wireless Sensor Networks
Secure and Efficient Hierarchical Data Aggregation in Wireless Sensor NetworksSecure and Efficient Hierarchical Data Aggregation in Wireless Sensor Networks
Secure and Efficient Hierarchical Data Aggregation in Wireless Sensor Networks
 
An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...
 
Intranetizen IIC12: how to ensure project faillure
Intranetizen IIC12: how to ensure project faillureIntranetizen IIC12: how to ensure project faillure
Intranetizen IIC12: how to ensure project faillure
 
FPGA Based Wireless Jamming Networks
FPGA Based Wireless Jamming NetworksFPGA Based Wireless Jamming Networks
FPGA Based Wireless Jamming Networks
 
Comparative study on Garments dyeing process and Fabric dyeing process on var...
Comparative study on Garments dyeing process and Fabric dyeing process on var...Comparative study on Garments dyeing process and Fabric dyeing process on var...
Comparative study on Garments dyeing process and Fabric dyeing process on var...
 

Similar to Efficient Implementation of Low Power 2-D DCT Architecture

Modified approximate 8-point multiplier less DCT like transform
Modified approximate 8-point multiplier less DCT like transformModified approximate 8-point multiplier less DCT like transform
Modified approximate 8-point multiplier less DCT like transformIJERA Editor
 
ASIC Implementation for SOBEL Accelerator
ASIC Implementation for SOBEL AcceleratorASIC Implementation for SOBEL Accelerator
ASIC Implementation for SOBEL AcceleratorIRJET Journal
 
IIIRJET-Implementation of Image Compression Algorithm on FPGA
IIIRJET-Implementation of Image Compression Algorithm on FPGAIIIRJET-Implementation of Image Compression Algorithm on FPGA
IIIRJET-Implementation of Image Compression Algorithm on FPGAIRJET Journal
 
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...VLSICS Design
 
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...VLSICS Design
 
LOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORM
LOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORMLOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORM
LOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORMVLSICS Design
 
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNSLOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNScscpconf
 
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNSLOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNScsandit
 
Low power area gdi & ptl techniques based full adder designs
Low power area gdi & ptl techniques based full adder designsLow power area gdi & ptl techniques based full adder designs
Low power area gdi & ptl techniques based full adder designscsandit
 
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSI
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSIIRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSI
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSIIRJET Journal
 
Implementation of Area & Power Optimized VLSI Circuits Using Logic Techniques
Implementation of Area & Power Optimized VLSI Circuits Using Logic TechniquesImplementation of Area & Power Optimized VLSI Circuits Using Logic Techniques
Implementation of Area & Power Optimized VLSI Circuits Using Logic TechniquesIOSRJVSP
 
Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...
Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...
Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...Associate Professor in VSB Coimbatore
 
Bivariatealgebraic integerencoded arai algorithm for
Bivariatealgebraic integerencoded arai algorithm forBivariatealgebraic integerencoded arai algorithm for
Bivariatealgebraic integerencoded arai algorithm foreSAT Publishing House
 
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...IJERA Editor
 
Kassem2009
Kassem2009Kassem2009
Kassem2009lazchi
 
Performance boosting of discrete cosine transform using parallel programming ...
Performance boosting of discrete cosine transform using parallel programming ...Performance boosting of discrete cosine transform using parallel programming ...
Performance boosting of discrete cosine transform using parallel programming ...IAEME Publication
 
Comparative Analysis of Different Types of Full Adder Circuits
Comparative Analysis of Different Types of Full Adder CircuitsComparative Analysis of Different Types of Full Adder Circuits
Comparative Analysis of Different Types of Full Adder CircuitsIOSR Journals
 
Design Analysis of Delay Register with PTL Logic using 90 nm Technology
Design Analysis of Delay Register with PTL Logic using 90 nm TechnologyDesign Analysis of Delay Register with PTL Logic using 90 nm Technology
Design Analysis of Delay Register with PTL Logic using 90 nm TechnologyIJEEE
 

Similar to Efficient Implementation of Low Power 2-D DCT Architecture (20)

Hv2514131415
Hv2514131415Hv2514131415
Hv2514131415
 
An35225228
An35225228An35225228
An35225228
 
Modified approximate 8-point multiplier less DCT like transform
Modified approximate 8-point multiplier less DCT like transformModified approximate 8-point multiplier less DCT like transform
Modified approximate 8-point multiplier less DCT like transform
 
ASIC Implementation for SOBEL Accelerator
ASIC Implementation for SOBEL AcceleratorASIC Implementation for SOBEL Accelerator
ASIC Implementation for SOBEL Accelerator
 
IIIRJET-Implementation of Image Compression Algorithm on FPGA
IIIRJET-Implementation of Image Compression Algorithm on FPGAIIIRJET-Implementation of Image Compression Algorithm on FPGA
IIIRJET-Implementation of Image Compression Algorithm on FPGA
 
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
 
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
 
LOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORM
LOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORMLOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORM
LOW POWER-AREA DESIGNS OF 1BIT FULL ADDER IN CADENCE VIRTUOSO PLATFORM
 
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNSLOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
 
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNSLOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
LOW POWER-AREA GDI & PTL TECHNIQUES BASED FULL ADDER DESIGNS
 
Low power area gdi & ptl techniques based full adder designs
Low power area gdi & ptl techniques based full adder designsLow power area gdi & ptl techniques based full adder designs
Low power area gdi & ptl techniques based full adder designs
 
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSI
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSIIRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSI
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSI
 
Implementation of Area & Power Optimized VLSI Circuits Using Logic Techniques
Implementation of Area & Power Optimized VLSI Circuits Using Logic TechniquesImplementation of Area & Power Optimized VLSI Circuits Using Logic Techniques
Implementation of Area & Power Optimized VLSI Circuits Using Logic Techniques
 
Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...
Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...
Design of Power Efficient 4x4 Multiplier Based On Various Power Optimizing Te...
 
Bivariatealgebraic integerencoded arai algorithm for
Bivariatealgebraic integerencoded arai algorithm forBivariatealgebraic integerencoded arai algorithm for
Bivariatealgebraic integerencoded arai algorithm for
 
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...
Efficient Design of Ripple Carry Adder and Carry Skip Adder with Low Quantum ...
 
Kassem2009
Kassem2009Kassem2009
Kassem2009
 
Performance boosting of discrete cosine transform using parallel programming ...
Performance boosting of discrete cosine transform using parallel programming ...Performance boosting of discrete cosine transform using parallel programming ...
Performance boosting of discrete cosine transform using parallel programming ...
 
Comparative Analysis of Different Types of Full Adder Circuits
Comparative Analysis of Different Types of Full Adder CircuitsComparative Analysis of Different Types of Full Adder Circuits
Comparative Analysis of Different Types of Full Adder Circuits
 
Design Analysis of Delay Register with PTL Logic using 90 nm Technology
Design Analysis of Delay Register with PTL Logic using 90 nm TechnologyDesign Analysis of Delay Register with PTL Logic using 90 nm Technology
Design Analysis of Delay Register with PTL Logic using 90 nm Technology
 

More from IJMER

A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...IJMER
 
Developing Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingDeveloping Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingIJMER
 
Study & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreStudy & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreIJMER
 
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)IJMER
 
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...IJMER
 
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...IJMER
 
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...IJMER
 
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...IJMER
 
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationStatic Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationIJMER
 
High Speed Effortless Bicycle
High Speed Effortless BicycleHigh Speed Effortless Bicycle
High Speed Effortless BicycleIJMER
 
Integration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIntegration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIJMER
 
Microcontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemMicrocontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemIJMER
 
On some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesOn some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesIJMER
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...IJMER
 
Natural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningNatural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningIJMER
 
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessEvolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessIJMER
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersIJMER
 
Studies On Energy Conservation And Audit
Studies On Energy Conservation And AuditStudies On Energy Conservation And Audit
Studies On Energy Conservation And AuditIJMER
 
An Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLAn Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLIJMER
 
Discrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyDiscrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyIJMER
 

More from IJMER (20)

A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...
 
Developing Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingDeveloping Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed Delinting
 
Study & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreStudy & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja Fibre
 
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
 
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
 
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
 
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
 
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
 
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationStatic Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
 
High Speed Effortless Bicycle
High Speed Effortless BicycleHigh Speed Effortless Bicycle
High Speed Effortless Bicycle
 
Integration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIntegration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise Applications
 
Microcontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemMicrocontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation System
 
On some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesOn some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological Spaces
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...
 
Natural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningNatural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine Learning
 
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessEvolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
 
Studies On Energy Conservation And Audit
Studies On Energy Conservation And AuditStudies On Energy Conservation And Audit
Studies On Energy Conservation And Audit
 
An Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLAn Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDL
 
Discrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyDiscrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One Prey
 

Recently uploaded

What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Recently uploaded (20)

What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

Efficient Implementation of Low Power 2-D DCT Architecture

  • 1. www.ijmer.com International Journal of Modern Engineering Research (IJMER) Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169 ISSN: 2249-6645 Efficient Implementation of Low Power 2-D DCT Architecture 1 Kalyan Chakravarthy. K , 2G.V.K.S.Prasad 1 2 M.Tech student, ECE, AKRG College of Engineering and Technology, Nallajerla, INDIA Assistant professor, ECE, AKRG College of Engineering and Technology, Nallajerla, INDIA ABSTRACT: This Research paper includes designing a area efficient and low error Discrete Cosine Transform. This area efficient and low error DCT is obtained by using shifters and adders in place of multipliers. The main technique used here is CSD (Canonical Sign Digit) technique.CSD technique efficiently reduces redundant bits. Pipelining technique is also introduced here which reduces the processing time. I. INTRODUCTION Multimedia data processing, which encompasses almost every aspects of our daily life such as communication broad casting, data search, advertisement, video games, etc has become an integral part of our life style. The most significant part of multimedia systems is application involving image or video, which require computationally intensive data processing. Moreover, as the use of mobile device increases exponentially, there is a growing demand for multimedia application to run on these portable devices. A typical image/video transmission system is shown in the Figure 1. Figure1. Image/Video Transmission System In order to reduce the volume of multimedia data over wireless channel compression techniques are widely used. Efficiency of a transformation scheme can be directly gauged by its ability to pack input data into as few coefficients as possible. This allows the quantizer to discard coefficients with relatively small amplitudes without introducing visual distortion in the reconstructed image. In image compression, the image data is divided up into 8x8 blocks of pixels. (From this point on, each color component is processed independently, so a "pixel" means a single value, even in a color image.) A DCT is applied to each 8x8 block. DCT converts the spatial image representation into a frequency map: the low-order or "DC" term represents the average value in the block, while successive higher-order ("AC") terms represent the strength of more and more rapid changes across the width or height of the block. The highest AC term represents the strength of a cosine wave alternating from maximum to minimum at adjacent pixels. II. JPEG ENCODER The JPEG encoder is a major component in JPEG standard which is used in image compression. It involves a complex sub-block Discrete Cosine Transform (DCT), along with other quantization, zigzag and Entropy coding blocks. In this architecture, 2-D DCT is computed by combining two 1-D DCT that connected by a transpose buffer. The architecture uses 4059 slices, 6885 LUT, 58 I/Os of Xilinx Spartan-3 XC3S1500 FPGA and works at an operating frequency of 65.55 MHz. Vector processing using parallel multipliers is a method used for implementation of DCT. The advantages in the vector processing method are regular structure, simple control and interconnect, and good balance between performance and complexity of implementation. For the case of 8 x 8 block region, a onedimensional 8- point DCT followed by an internal transpose memory, followed by another one dimensional 8-point DCT provides the 2D DCT architecture. Here the real time DCT coefficients are used for matrix multiplication. Using vector processing, the output Y of an 8 x 8 DCT for input X is given by Y = C*X* CT, where C is the cosine coefficients and CT are the transpose coefficients. This equation can also be written as Y=C * Z, where Z = X*C T. The calculation is implemented by using eight multipliers and storing the co-efficients in ROMs. At the first clock, the eight inputs xOO to x07 are multiplied by the eight values in column one, resulting in eight products (POO _0 to POO 7). At the eighth clock, the eight inputs are multiplied by the eight values in column eight resulting in eight 586 products (P07 0 to P07 7). From the equations for Z, the intermediate values for the first row of Z are computed. The values for ZO (0 -7) can be www.ijmer.com 3164 | Page
  • 2. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169 ISSN: 2249-6645 calculated in eight clock cycles. All 64 values of Z are calculated in 64 clock cycles and then the process is repeated. The values of Z correspond to the 1-DDCT of the input X. Once the Z values are calculated, the 2D-DCT function Y can be calculated from Y = C*Z. Figure 2. 2-D DCT Architecture The maximum clock frequency is 65. 55 MHz when implemented with a Spartan FPGA XCS1500 device. III. MULTIPLIER-LESS 2-D DCT/IDCT ARCHITECTURE FOR JPEG The Discrete Cosine transform is widely used as the core of digital image compression. Discrete Cosine Transform attempts to de-correlate the image data. After de-correlation, each transform co-efficient can be encoded independently without losing compression efficiency. In this paper we present VLSI Implementation of fully pipelined multiplier less architecture of 8x8 2D DCT/IDCT. This architecture is used as the core of JPEG compression hardware. The 2-D DCT calculation is made using the 2-D DCT separability property, such that the whole architecture is divided into two 1-D DCT calculations by using a transpose buffer. The architecture described to implement 1-D DCT is based on Binary-Lifted DCT. The 2-D DCT architecture achieves an operating frequency of 166 MHz. One input block of 8x8 samples each of 8 bits each is processed in 0.198μs. The pipeline latency of proposed architecture is 45 Clock cycles. Figure 3 Architecture of 2-D DCT Figure 2. shows the architecture of 2-D DCT. 2D-DCT/IDCT design is divided into three major blocks namely Row-DCT, Transpose Buffer, and Column-DCT. Row-DCT and Column-DCT contains both 1DDCT (Figure 4) by Row . Access of Transpose buffer DCT and Column DCT is efficiently managed to achieve the performance and minimal usage of resources in terms of area. Figure 4. Architecture of 1-D DCT During Forward transform, 1D-DCT structure (Figure 4) is functionally active. Row-DCT block receives two 8-bit samples as an input in every cycle. Each sample is a signed 8- bit value and hence its value ranges from -128 to 127. The bit width of the transformed sample is maintained as 10-bit to accommodate 2-bit increment. Figure 5. Four stage pipeline 1D-DCT. 1D-DCT computation architecture (Figure 4 ) has a four stage internal pipeline shown in Figure 5.Transpose Buffer receives two 10-bit samples as an input every cycle. Each sample is a signed 10-bit value and hence its value ranges from -512 to 511. Since there is no data Manipulation in the module the output sample width remains as input sample width i.e. 10-bit. Transpose buffer has sixty-four 10-bit registers to store one 8X8 block 1D-DCT www.ijmer.com 3165 | Page
  • 3. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169 ISSN: 2249-6645 samples. Transpose operation on 8X8 block data is performed by writing the transformed samples in row-wise and reading them in column-wise and vice versa. Transpose Buffer block guarantees that the nth 8X8 block data will be written into the registers after (n-1) th 8X8 block data has been completely read out for further processing. The latency of the block is 31 cycles since the data can be read only after full 8X8 block is written in the registers. Column DCT block receives two 10-bit samples as an input in every cycle. Input samples are internally buffered. The 2D-DCT/IDCT architecture efficiently operates up to 166MHz. Pipeline latency for the initial 8x8 block with each element of 8 bits is 45 clock cycles which is due to 7 cycles at Row-DCT, 31 cycles for Row-DCT operation to complete, 7 cycles at Column-DCT. Effectively to perform complete 2D DCT on one 8x8 will take 33 Clock cycles on availability of continuous input data to process. For operating frequency of 166 MHz, the processing time of 8x8 blocks is 0.198μs. IV. QUANTIZATION ARCHITECTURE Two dimensional DCT takes important role in JPEG image compression. Architecture and VHDL design of 2-D DCT, combined with quantization and zig-zag arrangement, is described. The architecture is used in JPEG image compression. The output of DCT module needs to be multiplied with post-scaler value to get the real DCT coefficients. Postscaling process is done together with quantization process. 2-D DCT is computed by combining two 1-D DCT that connected by a transpose buffer. This design aimed to be implemented in cheap Spartan 3E XC3S500 FPGA. The 2-D DCT architecture uses 3174 gates, 1145 Slices and 11 multipliers of one Xilinx Spartan-3E XC3S500E FPGA and reaches an operating frequency of 84.81 MHz. One input block with 8 x 8 elements of 8 bits each is processed in 2470 ns and pipeline latency is 123 clock cycles. This architecture adopts the scaled 1-D DCT algorithm. It means the DCT coefficients produced by the algorithm are not the real coefficients. To get the real coefficients, the scaled ones must be multiplied with post-scaler value. Equation (1) is showing scaled 1-D DCT process. (1) y' = C x Variable x is 8 point vector. C is Arai's DCT matrix and y' is vector of scaled DCT coefficients. To get the real DCT coefficients, y' must be element by element multiplicated with post-scaling factor. It is shown in (2). (2) y = s .* y' Constant s is vector of post-scaling factor. Figure 6 shows the block diagram representation of the entire system. Figure 6. Block Diagram of entire system The DCT input/output has 8 points and data has to be entered and released in sequential manner, it takes 8 clock cycles for each input and output process. Totally, 8 points 1D-DCT computation needs 22 clock cycles. 1D-DCT computation is done in pipeline process. Figure 7 shows 1-D DCT architecture with pipelining. Figure 7. 1-D DCT with pipelining The 2-D DCT architecture was described in VHDL. This VHDL was synthesized into an Xilinx Spartan 3E family FPGA. According to synthesis result, maximum time delay produced is 5.895 ns. That constraint yields minimum clock period 11.790 ns. Maximum clock frequency can be used is 84.818 MHz. The 2-D DCT architecture uses 3174 gates, 1145 Slices and 11 multipliers. The stronger operator, multiplication is transformed to simple shift and adds operations by applying Horner’s rule. This reduces the power consumption. For example, consider the cosine coefficients c and g, c *X www.ijmer.com 3166 | Page
  • 4. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169 ISSN: 2249-6645 = 25 + 24 + 22 +1 (X) = (24 (3) + 5) (X) and g*X = 23 + 22 (X)= 22(3) (X) and the common terms they share is 3X. The common terms among the cosine basis are the partial outputs. These blocks are termed as precomputing units and an unit is shown in the Figure 3. The intermediate results from the precomputing blocks are added in the final stage yielding the DCT coefficients. The 3A is constructed by expressing it as 3A = 1A+2A = {1A + (1A<<1)}. Similarly the 5A can be expressed as {1A + (1A<<2)}. A. Low Power Odd And Even Architecture The stages involved in the determination of 1-D DCT coefficients are precomputing, odd DCT and even DCT. The proposed architecture is a multiplier less architecture. The representation of cosine basis in CSD format reduces power consumption as it has less number of ones. Each multiplication of the cosine coefficients with the sequence are expressed using Horner’s rule [6]. The sub structure sharing is carefully done so that it contains the actual sequence i.e. 1A prevalently. This keeps the result of the shift and adds nearly matching the direct multiplication of the cosine basis with the sequence. The Odd DCT precomputing units have 1A, -1A, 3A, and 5A structures to be shared. 3A can be expressed as (1A+2 1A) and 21A is obtained by left shifting A by 1 place to the left. Similarly 5A=1A+(A<<2). The even DCT contains the DC part of the transformed image i.e. in frequency domain. Hence care should be taken in calculating the Z0 value. The corresponding cosine basis is d and is represented as d = 2 5 + 23 + 22 +1. The precomputing unit of the even DCT consists of 1A, -1A and 3A only. The odd and even modules use four precomputing instances each. The outputs from the precomputing units are fed to the final adder stage. The final adder outputs are the desired DCT coefficients. The image pixel value is represented in 8bits and hence the output of the adder and subtractor has 9-bits.The input to the row DCT has 9-bits and its output is chosen to have 12 bits. Thus the column DCT is fed by 12-bit inputs. The DCT coefficients are quantized to have 12-bits. B. Synthesis Results The 2-D DCT architecture is implemented with Verilog HDL and simulated using MODELSIM for functional verification. Finally it is synthesized using QUARTUS II tool and mapped on to a Cyclone III device (65 nm). The simulated results are shown below. The DCT coefficients obtained through this architecture is verified against MATLAB results. A low complexity 2-D DCT architecture has been designed. The sub structure sharing removes the redundancy in the DCT calculation. The architecture can be pipelined to meet high performance needs such as applications in HD televisions, satellite imaging systems. V. IMPLEMENTATION OF THE PROPOSED APPROACH Raw image data used in applications such as high definition television, video conferencing, computer communication, etc. require large storage and high speed channels for handling huge volumes of image data. In order to reduce the storage and communication channel bandwidth requirements to manageable levels, data compression techniques are inevitable. To meet the timing requirements for high speed applications the compression has to be achieved at high speeds. www.ijmer.com 3167 | Page
  • 5. International Journal of Modern Engineering Research (IJMER) www.ijmer.com Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169 ISSN: 2249-6645 A. Pipelining The speed of the system depends on the time taken by the longest data path which is known as the critical path. To increase the speed of the architecture, pipelining technique can be used. Pipelining is the technique of introducing latches along the data path so as to reduce the critical path. Reduction in the critical path increases the speed of operation. Pipelining latches can only be placed across any feed-forward cutest of the architecture graph. The clock period is then limited and the critical path may be between i.An input and a latch, ii.A latch and an output, iii. Two Latches an input and an output. In this architecture, the pipelining is of fine grain type as the pipeline latches are introduced inside the 1-D module. The critical path of the 1-D module is from the input to the output. This includes the computational time of precomputing modules and the final adders as well. The critical path time is from the onset of the input xi+xj and the arrival of zk (i=0,1,,2,3;j=4,5,,6,7 and k=0 to 7). The speed can be improved if latches are introduced between the shifter and adder module of precomputing modules. After pipelining the critical path is from the input(X) to 3X generator. The architecture is synthesized into Stratix family EP1S10 device (for high performance needs). The results are shown in the Table1 Table 1. Frequency summary B. Device Utilization Summary (Non Pipelined 2-D DCT) C. Image Analysis Results Figure 8. Original Image Figure 9. Reconstructed From DCT Coefficients www.ijmer.com 3168 | Page
  • 6. www.ijmer.com International Journal of Modern Engineering Research (IJMER) Vol. 3, Issue. 5, Sep - Oct. 2013 pp-3164-3169 ISSN: 2249-6645 VI. CONCLUSION The 2-D DCT and 1D-DCT architectures which adopts algorithmic strength reduction technique to reduce the device utilization pulling the power consumption low have thus been designed. The DCT computation is performed with sufficiently high precision yielding an acceptable quality. The pipelined 2-D DCT architecture achieves a maximum operating frequency of 185.95 MHz. The first eight 2-D coefficients arrived at the nineteenth clock cycle and for the full 64 coefficients, it took about 26 clock cycles to compute. The future work can be oriented towards developing an encoder by architecting a quantizer, based on the strength reduction technique and an entropy encoder. REFERENCES [1]. [2]. [3]. [4]. [5]. [6]. [7]. Enas Dhuhri Kusuma, Thomas Sri Widodo “FPGA Implementation of Pipelined 2D- DCT and Quantization Architecture for JPEG Image Compression” Proceedings of 978-1-4244-6716-7/10/$26.00 ©2010 IEEE Vijay Kumar Sharma, K. K. Mahapatra and Umesh C. Pati “An Efficient Distributed Arithmetic based VLSI Architecture for DCT” Chidanandan, Bayoumi, M, “Area-Efficient NEDA Architecture for The 1-D DCT/IDCT” ICASSP 2006 Proceedings. 2006 IEEE International Conference. Zhenyu Liu Tughrul Arslan Ahmet T. Erdogan “A Novel Reconfigurable Low Power Distributed Arithmetic Architecture for Multimedia Applications” Proceedings of 1-4244-0630-7/07/$20.00 C 2007 IEEE. S.Saravanan, Dr.Vidyacharan Bhaskar, P.T.Lee ,“A High Performance Parallel Distributed Arithmetic DCT Architecture for H.264 Video Compression” European Journal of Scientific Research ISSN 1450- 216X Vol.42 No.4 (2010), pp.558-564 EuroJournals Publishing, Inc. 2010 VLSI Digital Signal Processing Systems: Design And Implementation (Hardcover) By Keshab K Parhi Vijaya Prakash.A.M, K.S.Gurumurthy “A Novel VLSI Architecture for Digital Image Compression Using Discrete Cosine Transform and Quantization”, IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.9, September 2010 Santosh Sanjeevannanavar, Nagamani A.N , “Efficient Design and FPGA Implementation of JPEG Encoder using Verilog HDL” 978-1-4673-0074- 2/11/$26.00 @2011 IEEE 584 www.ijmer.com 3169 | Page