This paper presents a highly parallel, pipelined VLSI architecture for the lifting-based 2D discrete wavelet transform (DWT), aimed at increasing processing speed in image processing applications. The architecture uses fixed-point lifting coefficients and combines pipelining with distributed arithmetic to achieve high throughput while keeping hardware cost under control. The design was written in Verilog HDL and implemented on a Virtex-6 FPGA, where it shows a significant improvement over conventional architectures.
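
To make the idea of fixed-point lifting concrete, the following is a minimal software sketch of a 1D lifting transform with integer-scaled coefficients. It is not the architecture described in this paper. Several things are assumptions made only for illustration: the CDF 9/7 wavelet (the abstract does not name the filter), an 8-bit fractional word length (`FRAC_BITS`), symmetric boundary extension, an even-length input, omission of the final scaling step, and all function names.

```python
# Illustrative sketch only: wavelet choice, word length, and names are assumptions.
FRAC_BITS = 8  # assumed number of fractional bits for the coefficients


def to_fixed(c):
    """Quantize a real coefficient to an integer scaled by 2**FRAC_BITS."""
    return int(round(c * (1 << FRAC_BITS)))


# CDF 9/7 lifting coefficients (assumed; the final scaling factor K is omitted).
ALPHA, BETA, GAMMA, DELTA = (
    to_fixed(c)
    for c in (-1.586134342, -0.05298011854, 0.8829110762, 0.4435068522)
)


def _predict(s, d, coeff, sign):
    """Adjust odd samples d from neighbouring even samples s (mirrored at the right edge)."""
    n = len(s)
    return [
        d[i] + sign * ((coeff * (s[i] + s[min(i + 1, n - 1)])) >> FRAC_BITS)
        for i in range(n)
    ]


def _update(s, d, coeff, sign):
    """Adjust even samples s from neighbouring odd samples d (mirrored at the left edge)."""
    return [
        s[i] + sign * ((coeff * (d[max(i - 1, 0)] + d[i])) >> FRAC_BITS)
        for i in range(len(s))
    ]


def forward_1d(x):
    """Forward lifting on an even-length integer list; returns (low-pass, high-pass)."""
    s, d = list(x[0::2]), list(x[1::2])
    d = _predict(s, d, ALPHA, +1)
    s = _update(s, d, BETA, +1)
    d = _predict(s, d, GAMMA, +1)
    s = _update(s, d, DELTA, +1)
    return s, d


def inverse_1d(s, d):
    """Undo forward_1d by applying the same steps in reverse order with opposite sign."""
    s = _update(s, d, DELTA, -1)
    d = _predict(s, d, GAMMA, -1)
    s = _update(s, d, BETA, -1)
    d = _predict(s, d, ALPHA, -1)
    x = [0] * (2 * len(s))
    x[0::2], x[1::2] = s, d
    return x
```

Because each lifting step changes only one of the two sample streams and the inverse subtracts exactly the same rounded quantity, this sketch reconstructs the input exactly even though the coefficients are quantized. In the usual separable formulation, a 2D transform would apply such a 1D pass along the rows and then along the columns of the image; in hardware, the multiply-and-shift in each step is the part that techniques such as distributed arithmetic would target.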