Abstract The wavelet transform has become the most interesting new algorithm for video compression. Yet there are many parameters within a wavelet analysis and synthesis which govern the quality of a decoded video. In this paper different wavelet decomposition strategies and their implications for the decoded video are discussed. A pool of color video sequences has been wavelet-transformed at different settings of the wavelet filter bank and quantization threshold and with decomposition of dyadic and packet wavelet transformation strategies. The empirical evaluation of the decomposition strategy is based on three benchmarks: a first judgment regards the perceived quality of the decoded video. The compression rate is a second crucial factor, and finally the best parameter setting with regards to the Peak Signal to Noise Ratio (PSNR). The investigation proposes dyadic decomposition as the chosen decomposition strategy.
High Speed and Area Efficient 2D DWT Processor Based Image Compressionsipij
The document describes a proposed high speed and area efficient 2D discrete wavelet transform (DWT) processor design for image compression applications implemented on FPGAs. The design uses a pipelined partially serial architecture to enhance speed while optimally utilizing FPGA resources. Simulation results show the design operating at 231MHz on a Spartan 3 FPGA, a 15% improvement over alternative designs. Resource utilization and speed are improved compared to previous implementations through the optimized DWT processor architecture and FPGA platform choice.
Video Denoising using Transform Domain MethodIRJET Journal
This document presents a proposed method for video denoising using dictionary learning and transform domain techniques. It begins with an abstract describing how traditional video denoising models based on Gaussian noise do not account for real-world noise sources. The proposed method then learns basis functions adaptively from input video frames using dictionary learning, providing a sparse representation. Hard thresholding is applied in the transform domain to compute denoised frames. Experimental results on standard test videos show the method achieves competitive performance compared to other approaches in terms of peak signal-to-noise ratio.
Multiple Binary Images Watermarking in Spatial and Frequency Domainssipij
Editing, reproduction and distribution of the digital multimedia are becoming extremely easier and faster with the existence of the internet and the availability of pervasive and powerful multimedia tools. Digital watermarking has emerged as a possible method to tackle these issues. This paper proposes a scheme using which more data can be inserted into an image in different domains using different techniques. This increases the embedding capacity. Using the proposed scheme 24 binary images can be embedded in the DCT domain and 12 binary images can be embedded in the spatial domain using LSB substitution technique in a single RGB image. The proposed scheme also provides an extra level of security to the watermark image by scrambling the image before embedding it into the host image. Experimental results show that the proposed watermarking method results in almost invisible difference between the watermarked image and the original image and is also robust against various image processing attacks.
Introduction to Video Compression Techniques - Anurag JainVideoguy
The document provides an overview of video compression techniques and standards. It discusses the motivation for video compression to reduce data sizes for storage and transmission. It then reviews several key video compression standards including H.261, H.263, MPEG-1, MPEG-2, MPEG-4, H.264 and others. For each standard, it summarizes the goals, features, applications and technical details like motion compensation methods, block sizes, and bitrate ranges.
COMPARISON OF DENOISING ALGORITHMS FOR DEMOSACING LOW LIGHTING IMAGES USING C...sipij
In modern digital cameras, the Bayer color filter array (CFA) has been widely used. It is also widely known as CFA 1.0. However, Bayer pattern is inferior to the red-green-blue-white (RGBW) pattern, which is also known as CFA 2.0, in low lighting conditions in which Poisson noise is present. It is well known that demosaicing algorithms cannot effectively deal with Poisson noise and additional denoising is needed in order to improve the image quality. In this paper, we propose to evaluate various conventional and deep learning based denoising algorithms for CFA 2.0 in low lighting conditions. We will also investigate the impact of the location of denoising, which refers to whether the denoising is done before or after a critical step of demosaicing. Extensive experiments show that some denoising algorithms can indeed improve the image quality in low lighting conditions. We also noticed that the location of denoising plays an important role in the overall demosaicing performance.
1) The document proposes a new color filter array (CFA) with six colors and a 2x3 periodic pattern to be used in digital cameras.
2) It analyzes existing CFAs like Bayer and Hirakawa and outlines limitations like sensitivity to noise and aliasing artifacts.
3) The proposed CFA has a simple demosaicking algorithm that takes advantage of its spectral properties to provide images with higher quality than standard Bayer CFA, with better robustness to aliasing and luminance/chrominance noise.
Performance Evaluation of H.264 AVC Using CABAC Entropy Coding For Compound I...DR.P.S.JAGADEESH KUMAR
The document evaluates the performance of H.264/AVC entropy coding (CABAC) for compressing different types of images. It is observed that CABAC is highly efficient at compressing compound images, achieving higher compression ratios and PSNR values compared to other image types at high bitrates. The proposed system compresses grayscale compound images using CABAC after applying Daubechies wavelet transform. Performance is measured using compression ratio and PSNR metrics at varying bits per pixel.
ROBUST IMAGE WATERMARKING METHOD USING WAVELET TRANSFORMsipij
In this paper a robust watermarking method operating in the wavelet domain for grayscale digital imagesis developed. The method first acomputes the differences between the watermark and the HH1 sub-band ofthe cover image values and then embed these differences in one of the frequency sub-bands. The resultsshow that embedding the watermark in the LH1 sub-band gave the best results. The results were evaluatedusing the RMSE and the PSNR of both the original and the watermarked image. Although the watermarkwas recovered perfectly in the ideal case, the addition of Gaussian noise, or compression of the imageusing JPEG with quality less than 100 destroys the embedded watermark. Different experiments werecarried out to test the performance of the proposed method and good results were obtained.
High Speed and Area Efficient 2D DWT Processor Based Image Compressionsipij
The document describes a proposed high speed and area efficient 2D discrete wavelet transform (DWT) processor design for image compression applications implemented on FPGAs. The design uses a pipelined partially serial architecture to enhance speed while optimally utilizing FPGA resources. Simulation results show the design operating at 231MHz on a Spartan 3 FPGA, a 15% improvement over alternative designs. Resource utilization and speed are improved compared to previous implementations through the optimized DWT processor architecture and FPGA platform choice.
Video Denoising using Transform Domain MethodIRJET Journal
This document presents a proposed method for video denoising using dictionary learning and transform domain techniques. It begins with an abstract describing how traditional video denoising models based on Gaussian noise do not account for real-world noise sources. The proposed method then learns basis functions adaptively from input video frames using dictionary learning, providing a sparse representation. Hard thresholding is applied in the transform domain to compute denoised frames. Experimental results on standard test videos show the method achieves competitive performance compared to other approaches in terms of peak signal-to-noise ratio.
Multiple Binary Images Watermarking in Spatial and Frequency Domainssipij
Editing, reproduction and distribution of the digital multimedia are becoming extremely easier and faster with the existence of the internet and the availability of pervasive and powerful multimedia tools. Digital watermarking has emerged as a possible method to tackle these issues. This paper proposes a scheme using which more data can be inserted into an image in different domains using different techniques. This increases the embedding capacity. Using the proposed scheme 24 binary images can be embedded in the DCT domain and 12 binary images can be embedded in the spatial domain using LSB substitution technique in a single RGB image. The proposed scheme also provides an extra level of security to the watermark image by scrambling the image before embedding it into the host image. Experimental results show that the proposed watermarking method results in almost invisible difference between the watermarked image and the original image and is also robust against various image processing attacks.
Introduction to Video Compression Techniques - Anurag JainVideoguy
The document provides an overview of video compression techniques and standards. It discusses the motivation for video compression to reduce data sizes for storage and transmission. It then reviews several key video compression standards including H.261, H.263, MPEG-1, MPEG-2, MPEG-4, H.264 and others. For each standard, it summarizes the goals, features, applications and technical details like motion compensation methods, block sizes, and bitrate ranges.
COMPARISON OF DENOISING ALGORITHMS FOR DEMOSACING LOW LIGHTING IMAGES USING C...sipij
In modern digital cameras, the Bayer color filter array (CFA) has been widely used. It is also widely known as CFA 1.0. However, Bayer pattern is inferior to the red-green-blue-white (RGBW) pattern, which is also known as CFA 2.0, in low lighting conditions in which Poisson noise is present. It is well known that demosaicing algorithms cannot effectively deal with Poisson noise and additional denoising is needed in order to improve the image quality. In this paper, we propose to evaluate various conventional and deep learning based denoising algorithms for CFA 2.0 in low lighting conditions. We will also investigate the impact of the location of denoising, which refers to whether the denoising is done before or after a critical step of demosaicing. Extensive experiments show that some denoising algorithms can indeed improve the image quality in low lighting conditions. We also noticed that the location of denoising plays an important role in the overall demosaicing performance.
1) The document proposes a new color filter array (CFA) with six colors and a 2x3 periodic pattern to be used in digital cameras.
2) It analyzes existing CFAs like Bayer and Hirakawa and outlines limitations like sensitivity to noise and aliasing artifacts.
3) The proposed CFA has a simple demosaicking algorithm that takes advantage of its spectral properties to provide images with higher quality than standard Bayer CFA, with better robustness to aliasing and luminance/chrominance noise.
Performance Evaluation of H.264 AVC Using CABAC Entropy Coding For Compound I...DR.P.S.JAGADEESH KUMAR
The document evaluates the performance of H.264/AVC entropy coding (CABAC) for compressing different types of images. It is observed that CABAC is highly efficient at compressing compound images, achieving higher compression ratios and PSNR values compared to other image types at high bitrates. The proposed system compresses grayscale compound images using CABAC after applying Daubechies wavelet transform. Performance is measured using compression ratio and PSNR metrics at varying bits per pixel.
ROBUST IMAGE WATERMARKING METHOD USING WAVELET TRANSFORMsipij
In this paper a robust watermarking method operating in the wavelet domain for grayscale digital imagesis developed. The method first acomputes the differences between the watermark and the HH1 sub-band ofthe cover image values and then embed these differences in one of the frequency sub-bands. The resultsshow that embedding the watermark in the LH1 sub-band gave the best results. The results were evaluatedusing the RMSE and the PSNR of both the original and the watermarked image. Although the watermarkwas recovered perfectly in the ideal case, the addition of Gaussian noise, or compression of the imageusing JPEG with quality less than 100 destroys the embedded watermark. Different experiments werecarried out to test the performance of the proposed method and good results were obtained.
Advance Digital Video Watermarking based on DWT-PCA for Copyright protectionIJERA Editor
This document presents a digital video watermarking technique based on discrete wavelet transform (DWT) and principal component analysis (PCA). It begins with an introduction to digital watermarking and an overview of spatial and transform domain watermarking methods. The document then describes DWT and PCA in more detail. It presents a watermarking scheme that uses DWT to decompose video frames into frequency subbands, and embeds a watermark into the principal components of the low frequency subband after applying PCA. Experimental results on a test video show the watermarked frames have no visible quality differences from the original and the watermark is robust to various attacks. The technique achieves imperceptibility measured by high peak signal-to-
DIGITAL WATERMARKING TECHNIQUE BASED ON MULTI-RESOLUTION CURVELET TRANSFORMijfcstjournal
In this paper an efficient & robust non-blind watermarking technique based on multi-resolution geometric analysis named curvelet transform is proposed. Curvelet transform represent edges along curve much more efficiently than the wavelet transform and other traditional transforms. The proposed algorithm of embedding watermark in different scales in curvelet domain is implemented and the results are compared
using proper metric. The visual quality of watermarked image, efficiency of data hiding and the quality of extracted watermark of curvelet domain embedding techniques with wavelet Domain at different number of decomposition levels are compared. Experimental results show that embedding in curvelet domain yields best visual quality in watermarked image, the quality of extracted watermark, robustness of the watermark and the data hiding efficiency.
This document presents a DWT-based video watermarking algorithm that embeds a watermark into randomly selected video frames in the mid-frequency DWT coefficients. The algorithm first divides the video into frames and extracts the blue channel of randomly selected frames. It then applies DWT to the blue channel, extracts the mid-frequency components, and embeds the watermark by modifying these coefficients. After inverse DWT and recombining the color channels, the watermarked frames are assembled into the watermarked video. The algorithm was tested on standard videos and performance was measured using PSNR and MSE between the original and watermarked videos.
This document presents a generalized discrete cosine transform (DCT) decimation scheme for DCT-domain spatial downscaling to improve visual quality. It proposes performing two-fold decimation on flexible-sized subframes larger than 8x8 blocks to reduce aliasing artifacts compared to pixel-domain downscaling schemes. Sparse matrix representations are derived to reduce the computation cost of the proposed DCT decimation method. Experimental results show the proposed approach achieves better visual quality than existing schemes while computational complexity increases but can be optimized by applying larger subframe sizes only to high activity regions.
Investigations on the role of analysis window shape parameter in speech enhan...karthik annam
This document summarizes a research paper that investigates the role of analysis window shape parameter in speech enhancement using a hybrid method. The paper analyzes how different window shapes (e.g. Hamming, Gaussian, Kaiser) impact speech enhancement when used to segment noisy speech frames. Six objective measures are used to evaluate enhanced speech quality for different window shapes. Experimental results show that window shape plays an important role in the enhancement process. An optimum shape constant is proposed to achieve better speech quality. Two new hybrid thresholding schemes combining different thresholding methods are also proposed and evaluated.
The document discusses different types of video compression standards including MPEG, H.261, H.263, and JPEG. It explains key concepts in video compression like frame rate, color resolution, spatial resolution, and image quality. MPEG standards like MPEG-1, MPEG-2, MPEG-4, and MPEG-7 are defined for compressing video and audio at different bit rates. Techniques like spatial and temporal redundancy reduction are used to compress video frames and consecutive frames. Compression reduces file sizes but can cause data loss during transmission.
Performance Analysis of Acoustic Echo Cancellation TechniquesIJERA Editor
Mainly, the adaptive filters are implemented in time domain which works efficiently in most of the applications. But in many applications the impulse response becomes too large, which increases the complexity of the adaptive filter beyond a level where it can no longer be implemented efficiently in time domain. An example of where this can happen would be acoustic echo cancellation (AEC) applications. So, there exists an alternative solution i.e. to implement the filters in frequency domain. AEC has so many applications in wide variety of problems in industrial operations, manufacturing and consumer products. Here in this paper, a comparative analysis of different acoustic echo cancellation techniques i.e. Frequency domain adaptive filter (FDAF), Least mean square (LMS), Normalized least mean square (NLMS) &Sign error (SE) is presented. The results are compared with different values of step sizes and the performance of these techniques is measured in terms of Error rate loss enhancement (ERLE), Mean square error (MSE)& Peak signal to noise ratio (PSNR).
here it introduces an efficient multi-resolution watermarking methodology for copyright protection of digital images. By adapting the watermark signal to the wavelet coefficients, the proposed method is highly image adaptive and the watermark signal can be strengthen in the most significant parts of the image. As this property also increases the watermark visibility, usage of the human visual system is incorporated to prevent perceptual visibility of embedded watermark signal. Experimental results show that the proposed system preserves the image quality and is vulnerable against most common image processing distortions. Furthermore, the hierarchical nature of wavelet transform allows for detection of watermark at various resolutions, resulting in reduction of the computational load needed for watermark detection based on the noise level. The performance of the proposed system is shown to be superior to that of other available schemes reported in the literature.
This document discusses polyphase filters and their applications. It begins with an introduction to polyphase filters and how they are used for decimation and interpolation in digital signal processing applications. Next, it describes active and passive polyphase filter implementations and their advantages. Finally, it discusses some example applications of polyphase filters, including predistortion for linearization of nonlinear systems and nonlinear echo cancellation using Volterra filters.
This document describes a hybrid technique for image enhancement that uses both frequency domain and spatial domain techniques. It begins with applying frequency domain techniques like discrete cosine transform (DCT) or discrete wavelet transform (DWT) to separate an image into magnitude and phase spectra. The magnitude is then enhanced before recombining it with the phase using inverse DCT/DWT. Spatial domain techniques like power law or log transforms are then applied to further enhance contrast and brightness. The technique is evaluated on sample images and shown to achieve better PSNR and lower MSE than frequency domain techniques alone. In conclusion, combining frequency and spatial domain methods provides an effective approach for image enhancement.
The document discusses various techniques for video compression, including reducing spatial, temporal, and spectral redundancy. It covers algorithms like DCT, VQ, and fractal compression. Key aspects of video compression standards like MPEG-1, MPEG-2, H.264 and techniques like motion estimation and motion compensated prediction are summarized. Current and developing video coding standards and their applications are also outlined.
MPEG is a video compression standard developed in the late 1980s to enable full-motion video over networks and storage mediums. It was created by the Motion Picture Experts Group to address the need for high compression ratios to transmit video given bandwidth limitations of the time. MPEG uses spatial and temporal redundancy reduction techniques like discrete cosine transformation, quantization, and entropy coding to compress video frames and take advantage of similarities between neighboring pixels and successive frames. It defines a group of pictures structure and different frame types like I, P, and B frames to enable features like random access while maintaining synchronization and error robustness. MPEG became widely adopted and evolved through standards like MPEG-1, MPEG-2, and MPEG
Image Interpolation Techniques in Digital Image Processing: An OverviewIJERA Editor
This document provides an overview of various image interpolation techniques, including nearest neighbor, bilinear, bicubic, B-spline, Lanczos, discrete wavelet transform (DWT), and Kriging. It analyzes the performance of each technique and shows that bicubic interpolation produces better results than nearest neighbor and bilinear by using polynomials. DWT and Kriging provide finer image details compared to other methods. Kriging uses a weighted average approach based on distance between points, rather than arbitrary weighting functions used in other techniques. The document includes examples applying different interpolation methods and analyzing their outputs.
1) Digital lensless holographic microscopy (DLHM) suffers from twin images and strong zero diffraction orders that reduce contrast in reconstructed images of biological samples. 2) The authors propose a numerical dark field illumination method to attenuate these issues by inserting an opaque circular stop in the light path during reconstruction. 3) On an experimental DLHM hologram of a microorganism, the proposed method improved contrast by 68% and enhanced detail visibility compared to regular DLHM reconstruction.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Deep Learning Based Voice Activity Detection and Speech EnhancementNAVER Engineering
The document summarizes speech recognition front-end technologies including voice activity detection (VAD) and speech enhancement. It discusses conventional signal processing based approaches and more recent deep learning based methods. For VAD, it describes adaptive context attention models that can dynamically adjust the context used based on noise type and SNR. For speech enhancement, it proposes a two-step neural network approach consisting of a prior network that makes multiple predictions from noisy features and a post network that combines these using a boosting method to produce enhanced features, allowing end-to-end training without an explicit masking step. The approach aims to better exploit neural network modeling power while reducing computation cost compared to conventional methods or single-step deep learning frameworks.
Modified DCT-based Audio Watermarking Optimization using Genetics AlgorithmTELKOMNIKA JOURNAL
Ease process digital data information exchange impact on the increase in cases of copyright
infringement. Audio watermarking is one solution in providing protection for the owner of the work. This
research aims to optimize the insertion parameters on Modified Discrete Cosine Transform (M-DCT) based
audio watermarking using a genetic algorithm, to produce better audio resistance. MDCT is applied after
reading host audio, then embedding in MDCT domain is applied by Quantization Index Modulation (QIM)
technique. Insertion within the MDCT domain is capable of generating a high imperceptible watermarked
audio due to its overlapping frame system. The system is optimized using genetic algorithms to improve
the value of imperceptibility and robustness in audio watermarking. In this research, the average SNR
reaches 20 dB, and ODG reaches -0.062. The subjective quality testing on the system obtains an average
MOS of 4.22 out of five songs tested. In addition, the system is able to withstand several attacks. The use
of M-DCT in audio watermaking is capable of producing excellent imperceptibility and better watermark
robustness.
Introduction to Digital Videos, Motion Estimation: Principles & Compensation. Learn more in IIT Kharagpur's Image and Video Communication online certificate course.
Review of Diverse Techniques Used for Effective Fractal Image CompressionIRJET Journal
This document reviews different techniques for fractal image compression to enhance compression ratio while maintaining image quality. It discusses algorithms like quadtree partitioning with Huffman coding (QPHC), discrete cosine transform based fractal image compression (DCT-FIC), discrete wavelet transform based fractal image compression (DWTFIC), and Grover's quantum search algorithm based fractal image compression (QAFIC). The document also analyzes works applying these techniques and concludes that combining QAFIC with the tiny block size processing algorithm may further improve compression ratio with minimal quality loss.
Video Compression Algorithm Based on Frame Difference Approaches ijsc
The huge usage of digital multimedia via communications, wireless communications, Internet, Intranet and cellular mobile leads to incurable growth of data flow through these Media. The researchers go deep in developing efficient techniques in these fields such as compression of data, image and video. Recently, video compression techniques and their applications in many areas (educational, agriculture, medical …) cause this field to be one of the most interested fields. Wavelet transform is an efficient method that can be used to perform an efficient compression technique. This work deals with the developing of an efficient video compression approach based on frames difference approaches that concentrated on the calculation of frame near distance (difference between frames). The
selection of the meaningful frame depends on many factors such as compression performance, frame details, frame size and near distance between frames. Three different approaches are applied for removing the lowest frame difference. In this paper, many videos are tested to insure the efficiency of this technique, in addition a good performance results has been obtained.
Efficient bitrate ladder construction for live video streamingMinh Nguyen
In live streaming applications, service providers generally use a bitrate ladder with fixed bitrate-resolution pairs instead of optimizing it per title to avoid the additional latency caused to find optimum bitrate-resolution pairs for every video content. This paper introduces an online bitrate ladder construction scheme for live video streaming applications. In this scheme, each target bitrate's optimized resolution is determined from any pre-defined set of resolutions using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment. Experimental results show that, on average, the proposed scheme yields significant bitrate savings while maintaining the same quality, compared to the HLS fixed bitrate ladder scheme without any noticeable additional latency in streaming.
Advance Digital Video Watermarking based on DWT-PCA for Copyright protectionIJERA Editor
This document presents a digital video watermarking technique based on discrete wavelet transform (DWT) and principal component analysis (PCA). It begins with an introduction to digital watermarking and an overview of spatial and transform domain watermarking methods. The document then describes DWT and PCA in more detail. It presents a watermarking scheme that uses DWT to decompose video frames into frequency subbands, and embeds a watermark into the principal components of the low frequency subband after applying PCA. Experimental results on a test video show the watermarked frames have no visible quality differences from the original and the watermark is robust to various attacks. The technique achieves imperceptibility measured by high peak signal-to-
DIGITAL WATERMARKING TECHNIQUE BASED ON MULTI-RESOLUTION CURVELET TRANSFORMijfcstjournal
In this paper an efficient & robust non-blind watermarking technique based on multi-resolution geometric analysis named curvelet transform is proposed. Curvelet transform represent edges along curve much more efficiently than the wavelet transform and other traditional transforms. The proposed algorithm of embedding watermark in different scales in curvelet domain is implemented and the results are compared
using proper metric. The visual quality of watermarked image, efficiency of data hiding and the quality of extracted watermark of curvelet domain embedding techniques with wavelet Domain at different number of decomposition levels are compared. Experimental results show that embedding in curvelet domain yields best visual quality in watermarked image, the quality of extracted watermark, robustness of the watermark and the data hiding efficiency.
This document presents a DWT-based video watermarking algorithm that embeds a watermark into randomly selected video frames in the mid-frequency DWT coefficients. The algorithm first divides the video into frames and extracts the blue channel of randomly selected frames. It then applies DWT to the blue channel, extracts the mid-frequency components, and embeds the watermark by modifying these coefficients. After inverse DWT and recombining the color channels, the watermarked frames are assembled into the watermarked video. The algorithm was tested on standard videos and performance was measured using PSNR and MSE between the original and watermarked videos.
This document presents a generalized discrete cosine transform (DCT) decimation scheme for DCT-domain spatial downscaling to improve visual quality. It proposes performing two-fold decimation on flexible-sized subframes larger than 8x8 blocks to reduce aliasing artifacts compared to pixel-domain downscaling schemes. Sparse matrix representations are derived to reduce the computation cost of the proposed DCT decimation method. Experimental results show the proposed approach achieves better visual quality than existing schemes while computational complexity increases but can be optimized by applying larger subframe sizes only to high activity regions.
Investigations on the role of analysis window shape parameter in speech enhan...karthik annam
This document summarizes a research paper that investigates the role of analysis window shape parameter in speech enhancement using a hybrid method. The paper analyzes how different window shapes (e.g. Hamming, Gaussian, Kaiser) impact speech enhancement when used to segment noisy speech frames. Six objective measures are used to evaluate enhanced speech quality for different window shapes. Experimental results show that window shape plays an important role in the enhancement process. An optimum shape constant is proposed to achieve better speech quality. Two new hybrid thresholding schemes combining different thresholding methods are also proposed and evaluated.
The document discusses different types of video compression standards including MPEG, H.261, H.263, and JPEG. It explains key concepts in video compression like frame rate, color resolution, spatial resolution, and image quality. MPEG standards like MPEG-1, MPEG-2, MPEG-4, and MPEG-7 are defined for compressing video and audio at different bit rates. Techniques like spatial and temporal redundancy reduction are used to compress video frames and consecutive frames. Compression reduces file sizes but can cause data loss during transmission.
Performance Analysis of Acoustic Echo Cancellation TechniquesIJERA Editor
Mainly, the adaptive filters are implemented in time domain which works efficiently in most of the applications. But in many applications the impulse response becomes too large, which increases the complexity of the adaptive filter beyond a level where it can no longer be implemented efficiently in time domain. An example of where this can happen would be acoustic echo cancellation (AEC) applications. So, there exists an alternative solution i.e. to implement the filters in frequency domain. AEC has so many applications in wide variety of problems in industrial operations, manufacturing and consumer products. Here in this paper, a comparative analysis of different acoustic echo cancellation techniques i.e. Frequency domain adaptive filter (FDAF), Least mean square (LMS), Normalized least mean square (NLMS) &Sign error (SE) is presented. The results are compared with different values of step sizes and the performance of these techniques is measured in terms of Error rate loss enhancement (ERLE), Mean square error (MSE)& Peak signal to noise ratio (PSNR).
here it introduces an efficient multi-resolution watermarking methodology for copyright protection of digital images. By adapting the watermark signal to the wavelet coefficients, the proposed method is highly image adaptive and the watermark signal can be strengthen in the most significant parts of the image. As this property also increases the watermark visibility, usage of the human visual system is incorporated to prevent perceptual visibility of embedded watermark signal. Experimental results show that the proposed system preserves the image quality and is vulnerable against most common image processing distortions. Furthermore, the hierarchical nature of wavelet transform allows for detection of watermark at various resolutions, resulting in reduction of the computational load needed for watermark detection based on the noise level. The performance of the proposed system is shown to be superior to that of other available schemes reported in the literature.
This document discusses polyphase filters and their applications. It begins with an introduction to polyphase filters and how they are used for decimation and interpolation in digital signal processing applications. Next, it describes active and passive polyphase filter implementations and their advantages. Finally, it discusses some example applications of polyphase filters, including predistortion for linearization of nonlinear systems and nonlinear echo cancellation using Volterra filters.
This document describes a hybrid technique for image enhancement that uses both frequency domain and spatial domain techniques. It begins with applying frequency domain techniques like discrete cosine transform (DCT) or discrete wavelet transform (DWT) to separate an image into magnitude and phase spectra. The magnitude is then enhanced before recombining it with the phase using inverse DCT/DWT. Spatial domain techniques like power law or log transforms are then applied to further enhance contrast and brightness. The technique is evaluated on sample images and shown to achieve better PSNR and lower MSE than frequency domain techniques alone. In conclusion, combining frequency and spatial domain methods provides an effective approach for image enhancement.
The document discusses various techniques for video compression, including reducing spatial, temporal, and spectral redundancy. It covers algorithms like DCT, VQ, and fractal compression. Key aspects of video compression standards like MPEG-1, MPEG-2, H.264 and techniques like motion estimation and motion compensated prediction are summarized. Current and developing video coding standards and their applications are also outlined.
MPEG is a video compression standard developed in the late 1980s to enable full-motion video over networks and storage mediums. It was created by the Motion Picture Experts Group to address the need for high compression ratios to transmit video given bandwidth limitations of the time. MPEG uses spatial and temporal redundancy reduction techniques like discrete cosine transformation, quantization, and entropy coding to compress video frames and take advantage of similarities between neighboring pixels and successive frames. It defines a group of pictures structure and different frame types like I, P, and B frames to enable features like random access while maintaining synchronization and error robustness. MPEG became widely adopted and evolved through standards like MPEG-1, MPEG-2, and MPEG
Image Interpolation Techniques in Digital Image Processing: An OverviewIJERA Editor
This document provides an overview of various image interpolation techniques, including nearest neighbor, bilinear, bicubic, B-spline, Lanczos, discrete wavelet transform (DWT), and Kriging. It analyzes the performance of each technique and shows that bicubic interpolation produces better results than nearest neighbor and bilinear by using polynomials. DWT and Kriging provide finer image details compared to other methods. Kriging uses a weighted average approach based on distance between points, rather than arbitrary weighting functions used in other techniques. The document includes examples applying different interpolation methods and analyzing their outputs.
1) Digital lensless holographic microscopy (DLHM) suffers from twin images and strong zero diffraction orders that reduce contrast in reconstructed images of biological samples. 2) The authors propose a numerical dark field illumination method to attenuate these issues by inserting an opaque circular stop in the light path during reconstruction. 3) On an experimental DLHM hologram of a microorganism, the proposed method improved contrast by 68% and enhanced detail visibility compared to regular DLHM reconstruction.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Deep Learning Based Voice Activity Detection and Speech EnhancementNAVER Engineering
The document summarizes speech recognition front-end technologies including voice activity detection (VAD) and speech enhancement. It discusses conventional signal processing based approaches and more recent deep learning based methods. For VAD, it describes adaptive context attention models that can dynamically adjust the context used based on noise type and SNR. For speech enhancement, it proposes a two-step neural network approach consisting of a prior network that makes multiple predictions from noisy features and a post network that combines these using a boosting method to produce enhanced features, allowing end-to-end training without an explicit masking step. The approach aims to better exploit neural network modeling power while reducing computation cost compared to conventional methods or single-step deep learning frameworks.
Modified DCT-based Audio Watermarking Optimization using Genetics AlgorithmTELKOMNIKA JOURNAL
Ease process digital data information exchange impact on the increase in cases of copyright
infringement. Audio watermarking is one solution in providing protection for the owner of the work. This
research aims to optimize the insertion parameters on Modified Discrete Cosine Transform (M-DCT) based
audio watermarking using a genetic algorithm, to produce better audio resistance. MDCT is applied after
reading host audio, then embedding in MDCT domain is applied by Quantization Index Modulation (QIM)
technique. Insertion within the MDCT domain is capable of generating a high imperceptible watermarked
audio due to its overlapping frame system. The system is optimized using genetic algorithms to improve
the value of imperceptibility and robustness in audio watermarking. In this research, the average SNR
reaches 20 dB, and ODG reaches -0.062. The subjective quality testing on the system obtains an average
MOS of 4.22 out of five songs tested. In addition, the system is able to withstand several attacks. The use
of M-DCT in audio watermaking is capable of producing excellent imperceptibility and better watermark
robustness.
Introduction to Digital Videos, Motion Estimation: Principles & Compensation. Learn more in IIT Kharagpur's Image and Video Communication online certificate course.
Review of Diverse Techniques Used for Effective Fractal Image CompressionIRJET Journal
This document reviews different techniques for fractal image compression to enhance compression ratio while maintaining image quality. It discusses algorithms like quadtree partitioning with Huffman coding (QPHC), discrete cosine transform based fractal image compression (DCT-FIC), discrete wavelet transform based fractal image compression (DWTFIC), and Grover's quantum search algorithm based fractal image compression (QAFIC). The document also analyzes works applying these techniques and concludes that combining QAFIC with the tiny block size processing algorithm may further improve compression ratio with minimal quality loss.
Video Compression Algorithm Based on Frame Difference Approaches ijsc
The huge usage of digital multimedia via communications, wireless communications, Internet, Intranet and cellular mobile leads to incurable growth of data flow through these Media. The researchers go deep in developing efficient techniques in these fields such as compression of data, image and video. Recently, video compression techniques and their applications in many areas (educational, agriculture, medical …) cause this field to be one of the most interested fields. Wavelet transform is an efficient method that can be used to perform an efficient compression technique. This work deals with the developing of an efficient video compression approach based on frames difference approaches that concentrated on the calculation of frame near distance (difference between frames). The
selection of the meaningful frame depends on many factors such as compression performance, frame details, frame size and near distance between frames. Three different approaches are applied for removing the lowest frame difference. In this paper, many videos are tested to insure the efficiency of this technique, in addition a good performance results has been obtained.
Efficient bitrate ladder construction for live video streamingMinh Nguyen
In live streaming applications, service providers generally use a bitrate ladder with fixed bitrate-resolution pairs instead of optimizing it per title to avoid the additional latency caused to find optimum bitrate-resolution pairs for every video content. This paper introduces an online bitrate ladder construction scheme for live video streaming applications. In this scheme, each target bitrate's optimized resolution is determined from any pre-defined set of resolutions using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment. Experimental results show that, on average, the proposed scheme yields significant bitrate savings while maintaining the same quality, compared to the HLS fixed bitrate ladder scheme without any noticeable additional latency in streaming.
In the current scenario compression of video files is in high demand. Color video compression has become a significant technology to lessen the memory space and to decrease transmission time. Video compression using fractal technique is based on self similarity concept by comparing the range block and domain block. However, its computational complexity is very high. In this paper we presented hybrid video compression technique to compress Audio/Video Interleaved file and overcome the problem of Computational complexity. We implemented Discrete Wavelet Transform and hybrid fractal HV partition technique using Particle Swarm Optimization (called mapping of PSO) for compression of videos. The analysis demonstrate that hybrid technique gives a very good speed up to compress video and achieve Peak Signal to Noise Ratio.
IMPROVEMENT OF BM3D ALGORITHM AND EMPLOYMENT TO SATELLITE AND CFA IMAGES DENO...ijistjournal
This paper proposes a new procedure in order to improve the performance of block matching and 3-D filtering (BM3D) image denoising algorithm. It is demonstrated that it is possible to achieve a better performance than that of BM3D algorithm in a variety of noise levels. This method changes BM3D algorithm parameter values according to noise level, removes prefiltering, which is used in high noise level; therefore Peak Signal-to-Noise Ratio (PSNR) and visual quality get improved, and BM3D complexities and processing time are reduced. This improved BM3D algorithm is extended and used to denoise satellite and color filter array (CFA) images. Output results show that the performance has upgraded in comparison with current methods of denoising satellite and CFA images. In this regard this algorithm is compared with Adaptive PCA algorithm, that has led to superior performance for denoising CFA images, on the subject of PSNR and visual quality. Also the processing time has decreased significantly.
IMPROVEMENT OF BM3D ALGORITHM AND EMPLOYMENT TO SATELLITE AND CFA IMAGES DENO...ijistjournal
This paper proposes a new procedure in order to improve the performance of block matching and 3-D filtering (BM3D) image denoising algorithm. It is demonstrated that it is possible to achieve a better performance than that of BM3D algorithm in a variety of noise levels. This method changes BM3D algorithm parameter values according to noise level, removes prefiltering, which is used in high noise level; therefore Peak Signal-to-Noise Ratio (PSNR) and visual quality get improved, and BM3D complexities and processing time are reduced. This improved BM3D algorithm is extended and used to denoise satellite and color filter array (CFA) images. Output results show that the performance has upgraded in comparison with current methods of denoising satellite and CFA images. In this regard this algorithm is compared with Adaptive PCA algorithm, that has led to superior performance for denoising CFA images, on the subject of PSNR and visual quality. Also the processing time has decreased significantly.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
NEW IMPROVED 2D SVD BASED ALGORITHM FOR VIDEO CODINGcscpconf
Video compression is one of the most important blocks of an image acquisition system.
Compression of video results in reduction of transmission bandwidth. In real time video
compression the incoming video data is directly compressed without being stored first.
Therefore real time video compression system operates under stringent timing constraints.
Current video compression standards like MPEG, H.26x series, involve emotion estimation and
compensation blocks which are highly computationally expensive and hence they are not
suitable for real time applications on resource scarce systems. Current applications like video
calling, video conferencing require low complexity video compression algorithms so that they
can be implemented in environments that have scarce computational resources (like mobile
phones). A low complexity video compression algorithm based on 2D SVD exists. In this paper, a modification to that algorithm which provides higher PSNR at the same bit rate is presented.
Comparative Study of Various Algorithms for Detection of Fades in Video Seque...theijes
In the multimedia environment, digital data has gained more importance in daily routine. Large volume of videos such as entertainment video, news video, cartoon video, sports video is accessed by masses to accomplish their different needs. In the field of video processing Shot boundary detection is current research area. Shot boundary detection has vast impact on effective browsing and retrieving, searching of video. It serves as the beginning to construct the content of videos. Video processing technology has crucial job to provide valid information from videos without loss of any information. This paper is a survey of various novel algorithm for detecting fade-in and fade-out used by renowned personals with different methods. This survey also emphasizes on different core concepts underlying the different detection schemes for the most used video transition effect: fades
A Novel Method for Sensing Obscene Videos using Scene Change DetectionRadita Apriana
Video scene change detection has great importance of managing and analyzing large amount of
videos. Traditionally this technique used for indexing, segmenting and categorizing different types of
videos. Very few works addressed to classify obscene using scene change detection method. In this
research we proposed a simple approach for sensing objectionable videos by observing scene changes
into different video genres. Video scenes are grouped into set of key frames. After analyzing duration of
each scene and counting the number of key frames of designated scene, it has been shown that obscene
videos have infrequent scene changing nature. While in sports, dramas, music and action films have large
number of scene changes. We used six types of video genres and the decision has been made by setting
a threshold based on extracted key frames. Experimental result showed that the accuracy is 83.33% and
false positive rate is 16.67%.
Propose shot boundary detection methods by using visual hybrid featuresIJECEIAES
Shot boundary detection is the fundamental technique that plays an important role in a variety of video processing tasks such as summarization, retrieval, object tracking, and so on. This technique involves segmenting a video sequence into shots, each of which is a sequence of interrelated temporal frames. This paper introduces two methods, where the first is for detecting the cut shot boundary via employing visual hybrid features, while the second method is to compare between them. This enhances the effectiveness of the performance of detecting the shot by selecting the strongest features. The first method was performed by utilizing hybrid features, which included statistics histogram of hue-saturation-value color space and grey level co-occurrence matrix. The second method was performed by utilizing hybrid features that include discrete wavelet transform and grey level co-occurrence matrix. The frame size decreased. This process had the advantage of reducing the computation time. Also used local adaptive thresholds, which enhanced the method’s performance. The tested videos were obtained from the BBC archive, which included BBC Learning English and BBC News. Experimental results have indicated that the second method has achieved (97.618%) accuracy performance, which was higher than the first and other methods using evaluation metrics.
Dynamic Texture Coding using Modified Haar Wavelet with CUDAIJERA Editor
Texture is an image having repetition of patterns. There are two types, static and dynamic texture. Static texture is an image having repetitions of patterns in the spatial domain. Dynamic texture is number of frames having repetitions in spatial and temporal domain. This paper introduces a novel method for dynamic texture coding to achieve higher compression ratio of dynamic texture using 2D-modified Haar wavelet transform. The dynamic texture video contains high redundant parts in spatial and temporal domain. Redundant parts can be removed to achieve high compression ratios with better visual quality. The modified Haar wavelet is used to exploit spatial and temporal correlations amongst the pixels. The YCbCr color model is used to exploit chromatic components as HVS is less sensitive to chrominance. To decrease the time complexity of algorithm parallel programming is done using CUDA (Compute Unified Device Architecture). GPU contains the number of cores as compared to CPU, which is utilized to reduce the time complexity of algorithms.
International Journal of Engineering Research and DevelopmentIJERD Editor
The document summarizes an emerging VP8 video codec that is designed for mobile devices. It aims to significantly reduce computational complexity through several techniques while maintaining good video quality. The key techniques include a predictive algorithm for motion estimation that reduces computation by 18.5-20x compared to full search, using integer discrete cosine transform instead of floating point to achieve 2.6-3.5x speed improvement, and skipping DCT and quantization for some macroblocks to reduce computations. Experimental results on test sequences show negligible quality degradation of 0.2-0.5dB for integer DCT and 0.5dB on average for the full codec, while achieving real-time encoding rates on mobile devices. The proposed low-complexity
An overview Survey on Various Video compressions and its importanceINFOGAIN PUBLICATION
With the rise of digital computing and visual data processing, the need for storage and transmission of video data became prevalent. Storage and transmission of uncompressed raw visual data is not a good practice, because it requires a large storage space and great bandwidth. Video compression algorithms can compress this raw visual data or video into smaller files with a little sacrifice on the quality. This paper an overview and comparison of standard efforts on video compression algorithm of: MPEG-1, MPEG-2, MPEG-4, MPEG-7
A Hybrid DWT-SVD Method for Digital Video Watermarking Using Random Frame Sel...researchinventy
This document presents a hybrid DWT-SVD method for digital video watermarking using random frame selection. The proposed method embeds a watermark into randomly selected video frames by applying discrete wavelet transform and singular value decomposition. The blue channel of selected frames is used for watermark embedding in the mid-frequency DWT coefficients. Experimental results show the method provides good imperceptibility and robustness against various attacks like compression, cropping, noise addition, contrast changes and tampering. The normalization coefficient between original and extracted watermarks is used to evaluate the performance under different attacks.
Performance and Analysis of Video Compression Using Block Based Singular Valu...IJMER
This document presents an analysis of low-complexity video compression using block-based singular value decomposition (SVD) algorithms. It begins with an introduction to video compression and its importance for reducing storage and transmission costs. Current video compression standards like MPEG and H.26x are computationally expensive, making them unsuitable for real-time applications. The document then discusses block SVD algorithms as an alternative that can provide higher quality compression at lower computational complexity. It analyzes reducing the time complexity of video compression using block SVD and compares it to other compression methods. The document outlines the SVD decomposition process and how a 2D version can be applied to groups of image blocks for more efficient compression than 1D SVD.
SECURED COLOR IMAGE WATERMARKING TECHNIQUE IN DWT-DCT DOMAIN ijcseit
The multilayer secured DWT-DCT and YIQ color space based image watermarking technique with
robustness and better correlation is presented here. The security levels are increased by using multiple pn
sequences, Arnold scrambling, DWT domain, DCT domain and color space conversions. Peak signal to
noise ratio and Normalized correlations are used as measurement metrics. The 512x512 sized color images
with different histograms are used for testing and watermark of size 64x64 is embedded in HL region of
DWT and 4x4 DCT is used. ‘Haar’ wavelet is used for decomposition and direct flexing factor is used. We
got PSNR value is 63.9988 for flexing factor k=1 for Lena image and the maximum NC 0.9781 for flexing
factor k=4 in Q color space. The comparative performance in Y, I and Q color space is presented. The
technique is robust for different attacks like scaling, compression, rotation etc.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
Review On Fractal Image Compression TechniquesIRJET Journal
This document reviews different techniques for fractal image compression. It discusses how fractal image compression works by dividing an image into range and domain blocks. It then summarizes several papers that propose techniques for fractal image compression using discrete cosine transform (DCT) or discrete wavelet transform (DWT). These techniques aim to improve compression ratio and reduce encoding time. Finally, the document proposes a new method that combines wavelets with fractal image compression to further increase compression ratio while maintaining low image quality losses during decompression.
Similar to Empirical Evaluation of Decomposition Strategy for Wavelet Video Compression (20)
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
This slide is special for master students (MIBS & MIFB) in UUM. Also useful for readers who are interested in the topic of contemporary Islamic banking.
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Diana Rendina
Librarians are leading the way in creating future-ready citizens – now we need to update our spaces to match. In this session, attendees will get inspiration for transforming their library spaces. You’ll learn how to survey students and patrons, create a focus group, and use design thinking to brainstorm ideas for your space. We’ll discuss budget friendly ways to change your space as well as how to find funding. No matter where you’re at, you’ll find ideas for reimagining your space in this session.
This document provides an overview of wound healing, its functions, stages, mechanisms, factors affecting it, and complications.
A wound is a break in the integrity of the skin or tissues, which may be associated with disruption of the structure and function.
Healing is the body’s response to injury in an attempt to restore normal structure and functions.
Healing can occur in two ways: Regeneration and Repair
There are 4 phases of wound healing: hemostasis, inflammation, proliferation, and remodeling. This document also describes the mechanism of wound healing. Factors that affect healing include infection, uncontrolled diabetes, poor nutrition, age, anemia, the presence of foreign bodies, etc.
Complications of wound healing like infection, hyperpigmentation of scar, contractures, and keloid formation.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet.Land serves as the foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of Advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels. Advanced technologies like
Remote Sensing and Geographic Information Systems
9
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur natural.
Chapter wise All Notes of First year Basic Civil Engineering.pptxDenish Jangid
Chapter wise All Notes of First year Basic Civil Engineering
Syllabus
Chapter-1
Introduction to objective, scope and outcome the subject
Chapter 2
Introduction: Scope and Specialization of Civil Engineering, Role of civil Engineer in Society, Impact of infrastructural development on economy of country.
Chapter 3
Surveying: Object Principles & Types of Surveying; Site Plans, Plans & Maps; Scales & Unit of different Measurements.
Linear Measurements: Instruments used. Linear Measurement by Tape, Ranging out Survey Lines and overcoming Obstructions; Measurements on sloping ground; Tape corrections, conventional symbols. Angular Measurements: Instruments used; Introduction to Compass Surveying, Bearings and Longitude & Latitude of a Line, Introduction to total station.
Levelling: Instrument used Object of levelling, Methods of levelling in brief, and Contour maps.
Chapter 4
Buildings: Selection of site for Buildings, Layout of Building Plan, Types of buildings, Plinth area, carpet area, floor space index, Introduction to building byelaws, concept of sun light & ventilation. Components of Buildings & their functions, Basic concept of R.C.C., Introduction to types of foundation
Chapter 5
Transportation: Introduction to Transportation Engineering; Traffic and Road Safety: Types and Characteristics of Various Modes of Transportation; Various Road Traffic Signs, Causes of Accidents and Road Safety Measures.
Chapter 6
Environmental Engineering: Environmental Pollution, Environmental Acts and Regulations, Functional Concepts of Ecology, Basics of Species, Biodiversity, Ecosystem, Hydrological Cycle; Chemical Cycles: Carbon, Nitrogen & Phosphorus; Energy Flow in Ecosystems.
Water Pollution: Water Quality standards, Introduction to Treatment & Disposal of Waste Water. Reuse and Saving of Water, Rain Water Harvesting. Solid Waste Management: Classification of Solid Waste, Collection, Transportation and Disposal of Solid. Recycling of Solid Waste: Energy Recovery, Sanitary Landfill, On-Site Sanitation. Air & Noise Pollution: Primary and Secondary air pollutants, Harmful effects of Air Pollution, Control of Air Pollution. . Noise Pollution Harmful Effects of noise pollution, control of noise pollution, Global warming & Climate Change, Ozone depletion, Greenhouse effect
Text Books:
1. Palancharmy, Basic Civil Engineering, McGraw Hill publishers.
2. Satheesh Gopi, Basic Civil Engineering, Pearson Publishers.
3. Ketki Rangwala Dalal, Essentials of Civil Engineering, Charotar Publishing House.
4. BCP, Surveying volume 1
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxEduSkills OECD
Iván Bornacelly, Policy Analyst at the OECD Centre for Skills, OECD, presents at the webinar 'Tackling job market gaps with a skills-first approach' on 12 June 2024
Leveraging Generative AI to Drive Nonprofit InnovationTechSoup
In this webinar, participants learned how to utilize Generative AI to streamline operations and elevate member engagement. Amazon Web Service experts provided a customer specific use cases and dived into low/no-code tools that are quick and easy to deploy through Amazon Web Service (AWS.)
Leveraging Generative AI to Drive Nonprofit Innovation
Empirical Evaluation of Decomposition Strategy for Wavelet Video Compression
1. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 31
Empirical Evaluation of Decomposition Strategy for Wavelet
Video Compression
Rohmad Fakeh rohmad@rtm.gov.my
Engineering Division
Department of Broadcasting
Radio Television Malaysia
Abdul Azim Abd Ghani azim@fsktm.upm.edu.my
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
43400 Serdang, Selangor, Malaysia
ABSTRACT
The wavelet transform has become the most interesting new algorithm for video
compression. Yet there are many parameters within a wavelet analysis and
synthesis which govern the quality of a decoded video. In this paper different
wavelet decomposition strategies and their implications for the decoded video
are discussed. A pool of color video sequences has been wavelet-transformed at
different settings of the wavelet filter bank and quantization threshold and with
decomposition of dyadic and packet wavelet transformation strategies. The
empirical evaluation of the decomposition strategy is based on three
benchmarks: a first judgment regards the perceived quality of the decoded video.
The compression rate is a second crucial factor, and finally the best parameter
setting with regards to the Peak Signal to Noise Ratio (PSNR). The investigation
proposes dyadic decomposition as the chosen decomposition strategy.
Keywords: Wavelet Analysis, Decomposition Strategies, Empirical Evaluation
1. INTRODUCTION
Wavelet technology has provided an efficient framework of multi-resolution space-frequency
representation with promising applications in video processing. Discrete wavelet transform (DWT)
is becoming increasingly important in visual applications because of its flexibility in representing
non-stationary signals such as images and video sequences.
Applying 3D wavelet transform to digital video is a logical extension to the 2D analysis [20]. Most
video compression techniques use 2D coding to achieve spatial compression and motion
compensated difference coding in the time domain. Most of these techniques involved
complicated and expensive hardware. By applying the wavelet transform in all the three
dimensions, the computational complexity of coding while achieving high rates of compression
can be reduced, depending on the coding strategy [1].
The choice of coding strategy has been reported by many researchers [19], [20], [22], [6], [5], [1].
Basically the three-dimensional wavelet decomposition can be performed in three ways: temporal
filtering followed by two-dimensional spatial filtering known as (t+2D) [19], [20], [27], [22], [6], two-
dimensional spatial filtering followed by temporal filtering (2D+t) [30]. The third approach is
introduced for scalable video coding (SVC) is 2D+t+2D uses a first stage DWT to produce
reference video sequence at various resolution; t+2D transform are then performed on each
resolution level of the obtained spatial pyramid [1].
2. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 32
The research in t+2D orientation of the 3D spatio-temporal wavelet video coding one important
advantage is that it avoids motion estimation and motion compensation which are generally very
difficult task where the motion parameters are usually sensitive to transmission errors, include low
computational complexity [22]. The research on t+2D was on low bit rate wavelet image and
video compression with adaptive quantization, coding and post processing. A video signal is
decomposed into temporal and spatial frequency sub bands using temporal and spatial bandpass
analysis filter-banks. The computational burden of the 3D sub band video coding is minimized by
decomposing the video signal of temporal decomposition based on 2-tap Haar filter-bank
basically the difference and average between frames [24], [28].
However this research uses traditional multi-tap FIR filterbanks such as the 9-tap QMF filter of
Adelson [2] and perfect-reconstruction FIR filters of Daubechies [9], [10], [11]. This spatio-
temporal wavelet transformation which produces fixed and limited to 11 sub-bands tree-structured
spatio-temporal decomposition [22] as not optimum and not flexible since the limited level of
penetration is limited to two for low-pass-temporal and one level for high-pass temporal.
Along the same t+2D orientation using Haar filter for the temporal sub band filtering, research by
Ashourian et al. [6] on Robust 3-D sub band video coder has followed the same orientation as
done by Luo [22]. The research uses traditional multi-tap FIR filterbanks such as the 9-tap QMF
filter of Adelson [2] and spatio-temporal wavelet transformation produces 11 sub-bands. The
research also applies different types of quantization depends on the statistics of each of the 11
sub bands which is not optimum in video coding performance, and the high-pass sub bands are
quantized using a variant of vector quantization.
The rest of the paper is organized as follows: Section 2 discusses related works. Section 3
presents the proposed 3D wavelet video compression scheme. Section 4 contains performance
measures, and Section 5 presents results of decomposition strategy. Section 6 discusses the
outcomes from decomposition strategy. Finally, the conclusions are mentioned in Section 7.
2. RELATED WORKS
Research reported by Luo [22] was on low bit rate wavelet image and video compression with
adaptive quantization, coding and post processing. A video signal was decomposed into temporal
and spatial frequency subbands using temporal and spatial bandpass analysis filter-banks.
According to Karlsson et. al [19], the computational burden of the 3D sub band video coding is
minimized by decomposing the video signal of temporal decomposition based on 2-tap Haar filter-
bank, basically the difference and average between frames [24], [28]. This also minimizes the
number of frames needed to be stored and the delay caused by the analysis and synthesis
procedures. In the case of spatial decomposition, longer length of filters can be applied since
these filters can be operated in parallel.
The research uses traditional multi-tap FIR filterbanks such as the 9-tap QMF filter of Adelson [2]
and perfect-reconstruction FIR filters of Daubechies [9], [10], [11]. This spatio-temporal wavelet
transformation produces a fixed and limited an 11 sub-bands tree-structured spatio-temporal
decomposition as in Figure 1. The template for displaying the 11-band decomposition is as in
Figure 2. The sub bands produced correspond to penetration depth or decomposition level of two
to the Temporal Low-pass and one level to the Temporal High-Pass sub-bands.
The general strategies for the quantization and coding algorithm on the characteristics of the 11
sub bands are as follows:
1. Sub bands at courser scale levels with small index in Figure 2 with most significant
energy and higher visual significance requires relatively higher quality coding and finer
quantization.
2. Sub bands at finer scale levels are quantized more coarsely, or can be discarded.
3. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 33
FIGURE 1: An 11 sub-bands tree-structured spatio-temporal decomposition
FIGURE 2: Template for displaying the 11-band decomposition
Since images are of finite support, the research by Luo [22] used several applicable extension
methods, including zero padding and symmetric extension. Although the best extension method
may be image dependent, in most cases, appropriate symmetric extension yield optimum result.
The video sequences used are in CIF formats (360x288), with frame rate of 15fps, which are
typically used in videoconferencing through ISDN channels. There is no direct comparison to the
performance of the 3D methods used since the experimental results are on the entropy reduction
for Lena image and the extent of the adaptive quantization with results in comparison to EZW
coding. The report proposes a fixed number of sub bands with represents a decomposition of
level 2 and this is might not be optimal for rate-distortion point of view since the lowest frequency
sub bands can further be decomposed in a tree-structured fashion to achieve higher compression
ratio.
Research by Ashourian [6] on robust 3-D sub band video coder has followed the same orientation
as done by Luo [22]. The research also applies different types of quantization depends on the
statistics of each of the 11 sub bands and the high-pass sub bands are quantized using a variant
of vector quantization. The results of the simulation over un-noisy channels are as in Table 1.
Claire
Miss-
America
Salesman Suzie Carphone
Bitrate
(kbits/s)
Average PSNR [dB
62.0 37.0 38.5 30.2 33.5 30.0
113.0 39.5 42.0 32.8 34.9 32.2
355.0 42.1 43.8 37.6 38.6 36.9
TABLE 1: Performance comparison (PSNR, [dB] of research by Ashourian [6] at frame rate of
7.5fps, and video bitrate of 62, 113 and 355 kbps.
3. PROPOSED 3D WAVELET VIDEO COMPRESSION SCHEMEThe proposed wavelet 3D
video coding methods is outlined in Figure 3. The original input color video sequences in
INPUT VIDEO
SEQUENCES
4. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 34
QCIF, CIF and SIF formats are used in the simulations. The video sequences are fed into the
coder either in RGB, YUV or YCbCr color space first by temporal filtering without motion
compensation and followed by spatial filtering. The color-separator signal is transformed
into frequency space yielding a multi-resolution representation or wavelet tree of image
with different levels of detail.
Separable 3D dyadic wavelet transformation from the four families of filter banks are used and
applied to each of the luminance and chrominance components of the video sequences frame by
frame. The boundary extensions can be applied here from the proposed boundary treatment
strategies. The appropriate level of decomposition depth or penetration depth can be applied to
the compression scheme employing either global or level-dependent thresholds. The compressed
video sequences of every color components are reconstructed and the objective evaluation of
MSE, MAE, PSNR, Bit- rate and compression ratio are then calculated.
The input color sequences are first temporally decomposed into Low-Pass-Temporal (LPT) and
High-Pass-Temporal (HPT) sub bands. For example for YUV color space of the input video
sequences received, for SIF resolution of 352x240 pixels, the corresponding functions to read the
.sif video yielding individual Y,U and V image components using level dependent threshold, each
of the Y,U,V components are fed into the function of “comp_3dy_qsif_lvd.m” .
[Y,U,V]=read_sif('football',i).
By initializing A = zeros(size(352,240));
B = Y; % B is also assigned to values of U and V for the chrominance components
L_B = plus(A,B)/2; % For average temporal subbands of the same size
H_B = minus(A,B); % For difference temporal subbands of the same size
The various related information and explanation steps used in the proposed algorithm are
explained in the proceeding sections.
3.1 Input Sequences
The fundamental difficulty in testing an input image and video compression system is how to
decide which test sequences to use for the evaluations. Monochrome, color images and color
video sequences are to be selected and evaluated.
A) Monochrome Images
A digital grayscale image is typically represented by 8 bits per pixel (bpp) in its uncompressed
form. Each pixel has a value ranging from 0 (black) to 255 (white). Transform methods are
applied directly to a two dimensional image by first operating on the rows, and then on the
columns. Transforms that can be implemented in this way are called separable. Figure 4: (a)
shows the original Lena image which is then wavelet transformed to the fifth decomposition
levels, as in (b). The corresponding histogram plot and the frequency response are as in Figure 4:
(c) and (d) respectively.
5. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 35
Reconstructed Video
Sequences
FIGURE 3: Proposed 3D Wavelet Video Compression
B) Color Images
A digital color image is stored as a three-dimensional array and uses 24 bits to represent each
pixel in its uncompressed form. Each pixel contains a value representing a red (R), green (G),
and blue (B) component scaled between 0 and 255–this format is known as the RGB format.
Image compression schemes first convert the color image from the RGB format to another color
space representation that separates the image information better than RGB. In this thesis the
color images are converted to the luminance (Y), chrominance-blue (Cb), and chrominance-red
Input Video Sequences
(QCIF, CIF, SIF)
Applying 3-DWT
Low-Pass
Temporal
Frame
Temporal Filtering
(Average & Difference)
High-Pass
Temporal
Frame
Parameter Setting:
Appropriate Filter-Bank
Decomposition Level
Quantization Threshold
To the compression system
Parameter Setting:
Appropriate Filter-Bank
Decomposition Level
Quantization Threshold
To the compression system
* Separable Wavelet
Transform (Dyadic)
* Wavelet Packet
Objective Evaluation
• Compression Ratio
• Bit-rate
• MSE, MAE, PSNR
Boundary Treatment
Color Space
Conversion
6. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 36
(Cr) color space. The luminance component represents the intensity of the image and looks like a
grey scale version of the image. The chrominance-blue and chrominance-red components
represent the color information in the image. The Y, Cb, and Cr components are derived from the
RGB space.
a b
C d
FIGURE 4: (a) The original Lena Image, (b) Level 5 decomposition of Lena image, (c)
Histogram plot of the original Lena image, (d) Frequency response after level
5 decomposition of Lena image.
C) Source Picture Formats
To implement the standard, it is very important to know the picture formats that the standard
supports and positions of the samples in the picture. Table 2 shows the different kinds of motion
of QCIF video sequences. The samples are also referred to as pixels (picture elements) or pels.
Source picture formats are defined in terms of the number of pixels per line, the number of lines
per picture, and the pixel aspect ratio. H.263 allows for the use of five standardized picture
formats.
These are the CIF (common intermediate format), QCIF (quarter-CIF), sub-QCIF, 4CIF, and
16CIF. Besides these standardized formats, H.263 allows support for custom picture formats that
can be negotiated. Details of the five standardized picture formats are summarized in Table 3.
7. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 37
Since human eyes are less sensitive to the chrominance components, these components
typically have only half the resolution, both horizontally and vertically, of the Y components,
hence the term “4:2:0 format.”
No Video Sequences in QCIF
Formats
Kinds of Motion
1 Carphone Fast object translation
2 Claire Slow object translation
3 Foreman Object translation and panning. Medium spatial
detail and low amount of motion or vice versa.
4 News Medium spatial detail and low amount of motion or
vice versa.
5 Akiyo Low spatial detail and low amount of motion
6 Mother &Daughter Low spatial detail and low amount of motion
7 Grandmother Low spatial detail and low amount of motion
8 Salesman Low spatial detail and low amount of motion
9 Suzie Low spatial detail and low amount of motion
10 Miss America Low spatial detail and low amount of motion
TABLE 2: Different Kinds of Motion of QCIF Video Sequences
Format
Property Sub_QCIF QCIF CIF 4CIF 16CIF
Number of pixels per line
Number of lines
Uncompressed bit rate
(at 30 Hz), Mbit/s
128
96
4.4
176
144
9.1
352
288
37
704
576
146
1408
1152
584
TABLE 3: Standard Picture Format Supported by H.263
D) Original Sequences
Original test video sequences of Quarter Common Intermediate Format (QCIF) with each frame
containing 144 lines and 176 pixels per line, Common Interface Format (CIF) with each frame
containing 288 lines and 352 pixels per line and SIF with each frame containing 240 lines and
352 pixels per line. The QCIF test video sequences typically have different kinds of motion, such
as fast object translation, for the case of Carphone. Foreman has slow object translation and
panning. Claire sequence shows slow object translation and low motion activity. Suzie and Miss
America show stationary, small displacement and slow motion.
The original test sequences in CIF sized progressive digital sequences, are originally stored in
YUV format or YCrCb, with the U or Cr and V or Cb components sub-sampled 2:1 in both
horizontal and vertical directions.
E) Color Space Conversion
The video sequences of QCIF, CIF and SIF sizes are converted into individual YCbCr format so
that the data is represented in a form more suitable for compression, and wavelet decomposition
8. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 38
of each color component. In order to eliminate spectral redundancies, color space is changed
from RGB to YCbCr as the first step of compression. Changing the color space does not
introduce any error. The following equations transform RGB components into YCbCr of ICT
(Irreversible component transformation, used for lossy image compression):
Y
0.299 0.587 0.114
R
Cb = -0.16875 -0.33126 0.5
G
Cr 0.5 -0.41869 -0.08131
B
TABLE 4: Matrix of RGB to YCbCr Color Conversion
The first component Y, or luminance represents the intensity of the image. Cb and Cr are the
chrominance components and specify the blueness and redness of image respectively. Figure 5,
Figure 6, and Figure 7 shows the Akiyo, News and Stefan video image respectively, in the YCbCr
color space and in each of the three components. This illustrates the advantage of using the
YCbCr color space–most of the information is contained in the luminance. Each of the three
components (Y, Cb, and Cr) is input to the coder. The PSNR is measured for each compressed
component (Yout, Cbout, and Crout) just as for grayscale images. The three output components
are reassembled to form a reconstructed 24-bit color image.
As can be seen from the pictures of figures, the Y-component contributes the most information to
the image, as compared to the other two components of Cb and Cr. This makes it possible to get
greater compression by including more data from the Y-component than from the Cb and Cr
components.
(a) Original Akiyo (b) Y-component
(c) Cb Component (d) Cr Component
FIGURE 5: (a) The Original, (b) Y, (c) Cb and (d) Cr components for the Akiyo QCIF Sequence
(a) Original News (b) Y-component
9. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 39
(c) Cb Component (d) Cr Component
FIGURE 6: a) The Original, (b) Y, (c) Cb and (d) Cr components for the News CIF Sequence
(a) Original Stefan (b) Y-component
(c) Cb Component (d) Cr Component
FIGURE 7: a) The Original, (b) Y, (c) Cb and (d) Cr components for the the Stefan SIF Sequence
10. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 40
3.2 Wavelet Transformation
Video compression techniques, which only remove spatial redundancy, cannot be highly
effective. To achieve higher compression ratios the similarity of successive video frames has to
be exploited.
A) Temporal Compression
The color source video sequences in QCIF, SIF and CIF formats are used producing sequences
in Y, U and V color components as the input to the compression scheme. The average and
difference between frames producing two temporal low-pass and high-pass sub-bands are
calculated. For example the average and difference frame and the corresponding histogram plot
for the 1st
. frame of Akiyo video sequence is given as in Figure 8.
Average Frame Hist(L_By)
Difference Frame Hist(H_By)
FIGURE 8: Wavelet decomposition of Average and Difference frame and the corresponding
histogram plot for the 1st
. frame of Akiyo video sequence
B) Spatial Compression
For the application of the wavelet transform to images, cascaded 1D filters are used. The 1D
discrete wavelet transformation step is calculated by using Mallat's pyramid algorithm [23]. First,
the forward transformation combined with a down-sampling process of the image is performed,
once in the horizontal and twice in the vertical direction. This procedure produces one low- and
three high pass components. This procedure is applied recursively to the low pass output until the
resulting low pass component reaches a size small enough to achieve effective compression of
the image. In general, additional levels of transformation result in a higher compression ratio.
Spatial compression attempts to eliminate as much redundancy from single video frames as
possible without introducing degradation of quality. This is done by first transforming the image
from spatial to frequency domain and secondly by applying quantizing threshold the transformed
coefficients. A two dimensional discrete wavelet transformation (DWT) is applied spatially to the
images producing one level of decomposition into LL, LH, HL and HH sub-bands as in Figure 9.
Figure 10 shows the High-pass and Low pass sub-bands with the low-pass sub-bands further
decomposed and iterated. Figure 11 further illustrated the decomposition steps up to level three.
11. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 41
Both the Low-Pass-temporal and the High-Pass-Temporal sequences are shown to be spatially
decomposed into a number of sub bands as in Figure 12.
FIGURE 9: Level one of 2-D DWT applied on an image
Highpass
f
Lowpass
FIGURE 10: Level one of 2-D DWT of Highpas and Lowpass
1-level 2-level 3-level
FIGURE 11: Level Three Dyadic DWT scheme used for Image Compression
IMAGE
LL HL
HHLH
G
H
H
G
H
G
IMAGE
LL HL
HHLH LH HH
HL
LH HH
HL
HL
12. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 42
FIGURE 12: High-Pass Temporal and Low-Pass Temporal for Video sequences Compression
Given a signal s of length N, the DWT consists of N2log stages. The first step produces,
starting from s, two sets of coefficients: approximation coefficients 1cA and detail coefficients
1cD . These vectors are obtained by convolving s with the low pass filter LoF_D for
approximation, and with the high-pass filter HiF_D for detail, followed by dyadic decimation. For
images, an algorithm similar to the one-dimensional case is possible for two-dimensional
wavelets and scaling functions obtained from one-dimensional wavelets by tensor product. The
two-dimensional DWT leads to a decomposition of approximation coefficients at level j in four
components: the approximation at level 1+j and the details in three orientations (horizontal,
vertical and diagonal).
The multilevel 2-D decomposition in the MATLAB Environment, Wavelet Toolbox is a two-
dimensional wavelet analysis function, of [c,s] = wavedec2(x,n,’wname’), which returns the
wavelet decomposition of the input image of matrix x, at n decomposition levels, using the
wavelet named in the string ‘wname’. The output wavelet 2-D decomposition structure [C,S]
contains the wavelet decomposition vector C and the corresponding book-keeping matrix S.
Vector C is organized as: C = [ A(N) | H(N) | V(N) | D(N) | ... H(N-1) | V(N-1) | D(N-1) | ... |
H(1) | V(1) | D(1) ], where A, H, V, D, are row vectors such that: A = approximation coefficients, H
= horizontal detail coefficients, V = vertical detail coefficients, D = diagonal detail coefficients,
each vector is the vector column-wise storage of a matrix. Matrix S is such that: S(1,:) = size of
approximation coefficient(N), S(i,:) = size of detail coefficient(N-i+2) for i = 2,...,N+1 and S(N+2,:)
= size(X).
C) The choice wavelet filter banks
The choice of wavelet basis for video compression was based on reconstruction properties and
runtime complexity [16]. Generally, complexity for wavelet filter is )(nO , where n is the number
of filter taps. The one-dimensional n-tap filter pair is applied as follows:
∑
=
=
+=
1
0
)(2 0
~n
i
iikik xLl and ∑
=
=
+=
1
0
)(2 0
~n
i
iikik xHh
LPT
HPT
13. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 43
L
~
and H
~
are the low- and high-pass filters, x the pixel values with row- or column-index i , and
k is the index of filter output. Iterating with step k2 automatically introduces the desired down-
sampling by 2. Filter coefficients are real numbers in the range [-1,1].
Much research effort has been expended in the area of wavelet compression, with the results
indicating that wavelet approaches outperform DCT-based techniques [3], [4], [21], [25].
However, it is not completely clear which wavelets are suitable for video compression. Wavelets
implemented using linear-phase filters are generally advantageous for image processing because
such filters preserve the location of spatial details. A key component of an efficient video coding
algorithm is motion compensation. However at the present time it is too computationally intensive
to be used in software video compression. A number of low end applications therefore use
motion-JPEG, in essence frame by frame transmission of JPEG images [29] with no removal of
inter-frame redundancy.
The choice of filter bank in wavelet image and video compression is a crucial issue that affects
both image and video quality and compression ratio. A series of bi-orthogonal, and orthogonal
wavelet filters of differing length were evaluated by compressing and decompressing a number of
standard video test sequences, using different quantization thresholds. In this section, the
selection of wavelet filter-banks of QCIF, CIF and SIF video sequences and their implications for
the decoded image are discussed. A pool of color video sequences has been wavelet -
transformed with different settings of the wavelet filter bank, boundary selection, quantization
threshold and decomposition method. The reconstructed video sequences of QCIF sizes are
evaluated using an objective quality of peak signal to noise ratio (PSNR).
This section investigates how wavelet filter banks affect the subsequent quality and size of the
reconstructed data, using a wavelet based video codec developed. Test video sequences were
compressed with the codec, and the results obtained indicate that the choice of wavelet greatly
influences the quality of the compressed data and its size.
D) The Wavelet Filter-Banks
The DWT is implemented using a two-channel perfect reconstruction linear phase filter bank [26].
Symmetric extension techniques are used to apply the filters near the frame boundaries; an
approach that allows transforming images with arbitrary dimensions.
14. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 44
3.3 Wavelet Threshold Selection
This section describes wavelet thresholding for image compression under the framework
provided by Statistical Learning Theory aka Vapnik-Chervonenkis (VC) theory. Under the
framework of VC-theory, wavelet thresholding amounts to ordering of wavelet coefficients
according to their relevance to accurate function estimation, followed by discarding insignificant
coefficients. Existing wavelet thresholding methods specify an ordering based on the coefficient
magnitude, and use threshold(s) derived under gaussian noise assumption and asymptotic
settings. In contrast, the proposed approach uses orderings better reflecting statistical properties
of natural images, and VC-based thresholding developed for finite sample settings under very
general noise assumptions.
The plot of wavelet coefficients in Figure 4: (d) Frequency response after level 5 decomposition of
Lena image in wavelet domain, suggests that small coefficients are dominated by noise, while
coefficients with a large absolute value carry more signal information than noise. Stated more
precisely, the motivation to this thresholding idea based on the following assumptions:
• The de-correlating property of a wavelet transform creates a sparce signal: most
untouched coefficients are zero or close to zero.
• The noise level is not too high so that the signal wavelet coefficients can be distinguished
from the noisy ones.
As it turns out, this method is indeed effective and thresholding is a simple and efficient method
for noise reduction.
A) Global thresholding
Wavelet thresholding for image denoising involves taking the wavelet transform of an image (i.e.,
calculating the wavelet coefficients discarding setting to zero) the coefficients with relatively small
or insignificant magnitudes. By discarding small coefficients one actually discard wavelet basis
functions which have coefficients below a certain threshold. The denoised signal is obtained via
inverse wavelet transform of the kept coefficients. One global threshold derived by Donoho [13],
[14], [15] under gaussian noise assumption. Clearly, wavelet thresholding can be viewed a
special case of signal/data estimation from noisy samples, which can be addressed within the
framework of VC-theory. The original wavelet thresholding technique is equivalent to specifying a
structure that uses only a magnitude ordering of the wavelet coefficients. Obviously, this is not the
best way of ordering the coefficients.
B) Level dependent threshold
Level-dependent thresholding has been proposed to improve the performance of wavelet
thresholding method. Instead of using a global threshold, level-dependent thresholding uses a
group of thresholds, one for each scale level. It can be interpreted as the ordering of the wavelet
coefficients with respect to their magnitudes adjusted by scale level.
This suggests that the level-dependent thresholding be viewed as a special case of more
sophisticated importance ordering in model selection based denoising method. A number of
different structures (ordering schemes) can be specified on the same set of basis functions. A
good ordering should reflect the prior knowledge about the signal/data being estimated. Similarly,
2-D image signal estimation with VC approach may require more complicated ordering scheme.
4. PERFORMANCE MEASURES
Although there are several metrics that tend to be indicative of image quality, each of them has
situations in which it fails to coincide with an observer's opinion [26]. However, since running
human trials is generally prohibitively expensive, a number of metrics are often computed to help
judge image quality. Some of the more commonly used "quality" metrics are given below,
),( nmx stands for the original data sized M by N, and ),( nmx
)
the reconstructed mage.
15. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 45
• MAE (Mean Absolute Error)
MAE: One quantity, often computed in conjunction with other metrics, is maximum absolute error.
Since this metrics measure error, it is regarded as inversely proportional to image quality.
MAE = Max | |),(),( nmxnmx
)
− (1)
• MSE (Mean Square Error)
MSE, RMS: Two other quantities that appear frequently when comparing original and
reconstructed or approximated data are mean square error, and root mean square. These metrics
attempt to measure an inverse to image quality.
MSE = [ ]∑∑
−
=
−
=
−
1
0
2
1
0
|),(),(
.
1 M
j
N
i
nmxnmx
MN
)
(2)
RMS = MSE (3)
• SNR: Signal to noise ratio
SNR = 10 10log
[ ]
−∑∑
∑∑
−
=
=
=
−
=
−
=
1
1
1
0
2
1
0
2
1
0
),(),(
),(
N
i
M
j
M
j
N
i
nmxnmx
nmx
)
(4)
• Peak Signal-to-Noise Ratio (PSNR)
Peak signal-to-noise ratio (PSNR) is the standard method for quantitatively comparing a
compressed image with the original. For an 8-bit grayscale image, the peak signal value is 255.
Hence the PSNR of an M×N 8-bit grayscale image x and its reconstruction ˆx is calculated.
PSNR: Peak signal-to-noise ratio. The MSE and PSNR are directly related, and one normally
uses PSNR to measure the coder's objective performance.
PSNR = 10 10log
MSE
2
255
(5)
At high rate, images with PSNR above 32 dB are considered to be perceptually lossless. At
medium and low rates, the PSNR does not agree with the quality of the image. For color images,
the reconstruction of all three color spaces must be considered in the PSNR calculation. The MSE
is calculated for the reconstruction of each color space. The average of these three MSEs is used
to generate the PSNR of the reconstructed RGB image (as compared to the original 24-bit RGB
image).
PSNR = 10 10log
RGBMSE
2
255
(6)
Where RGBMSE is:-
RGBMSE = 3
gluegreenred MSEMSEMSE ++
(7)
16. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 46
• Compression ratio (CR)
Compression ratio is the relation between the amount of data of the original signal compared to
the amount of data of the encoded signal [8]:
=CR
)(
)(
encodeddataofAmount
signaloriginaldataofAmount
(8)
5. RESULTS OF DECOMPOSITION STRATEGY
To evaluate the decomposition strategy, three types of video sequences in QCIF resolutions with
varying types of motion such as Miss America, Foreman and Carphone are used in the
simulation. To compare the performance of the decomposition strategy between the dyadic
discrete wavelet transform (DWT) and wavelet packet (WP), an empirical evaluation is conducted
using three test video sequences of Miss America, Foreman and Carphone. Nine types of
wavelet filter-banks s are used for the evaluation. They are Bior-22, Bior-2.6, Bior-4.4, Bior-6.8,
Coif-2, Coif-3, Sym-4, Sym-5 and Sym-7. Global threshold is used in this evaluation, with values
of threshold (thr) ranging from 10 to 125. For every filter-bank used for decoding the three video
sequences, the average PSNR values for both the DWT and WP are calculated and tabulated as
in Table 5 to Table 10. The corresponding plots of PSNR values Vs the filter-banks used are as in
Figure 13 to Figure 18.
Thr=10
Miss America Foreman Carphone Average
PSNR [dB]
Filter
Name
DWT WP DWT WP DWT WP DWT WP
Bior-2.2
Bior-2.6
Bior-4.4
Bior-6.8
Coif-2
Coif-3
Sym-4
Sym-5
Sym-7
42.09
42.40
41.80
41.96
41.97
41.89
41.98
42.03
41.86
40.90
40.88
40.73
40.86
30.33
40.06
30.02
41.86
39.96
38.16
38.34
38.28
38.33
38.42
38.39
38.41
38.39
38.34
32.85
32.78
33.38
33.33
21.99
30.18
22.36
36.10
28.47
39.99
40.18
39.71
39.63
39.87
39.74
39.88
39.85
39.62
33.77
33.18
34.37
34.07
22.30
31.47
21.34
37.86
29.90
40.08
40.31
39.93
39.98
40.08
40.01
40.09
40.09
39.94
35.84
35.61
36.16
36.09
24.88
33.90
24.57
38.60
32.78
TABLE 5: Average in PSNR for Wavelet Packet and Discrete Wavelet Transform of three video
sequences of Miss America, Foreman, and Carphone , using the best nine types
filters and Global threshold of 10
17. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 47
FIGURE 13: The Plot for Average PSNR for Wavelet Packet and Discrete Wavelet Transform of
using the best nine types of wavelet filters and Global threshold of 10
Thr=20
Miss America Foreman Carphone Average
PSNR [dB]
Filter Name
DWT WP DWT WP DWT WP DWT WP
bior-2.2
bior-2.6
bior-4.4
bior-6.8
coif-2
coif-3
sym-4
sym-5
sym-7
38.32
38.69
37.74
38.02
38.09
38.06
37.97
38.04
37.94
37.99
38.17
37.47
37.73
30.60
37.59
30.77
38.09
37.09
33.45
33.69
33.32
33.47
33.46
33.43
33.48
33.44
38.34
30.98
31.08
31.54
31.60
23.03
29.94
23.61
32.88
29.59
35.32
35.47
34.85
34.88
34.97
34.86
35.00
34.76
34.61
32.28
32.07
32.78
32.73
22.61
31.28
22.79
34.31
30.11
35.70
35.95
35.30
35.45
35.51
35.45
35.48
35.42
36.96
33.75
33.77
33.93
34.02
25.42
32.94
25.72
35.09
32.27
Table 6: Average in PSNR for Wavelet Packet and Discrete Wavelet Transform of three video
sequences of Miss America, Foreman, and Carphone, using the best nine types
filters and Global threshold of 20
18. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 48
FIGURE 14: The Plot for Average PSNR for Wavelet Packet and Discrete Wavelet Transform of
using the best nine types of wavelet filters and Global threshold of 20
Thr=45
Miss America Foreman Carphone Average
PSNR [dB]
Filter
Name
DWT WP DWT WP DWT WP DWT WP
bior-2.2
bior-2.6
bior-4.4
bior-6.8
coif-2
coif-3
sym-4
sym-5
sym-7
34.70
35.06
33.92
34.23
34.24
34.22
34.06
33.90
34.07
34.34
34.69
33.86
34.18
31.72
34.04
31.75
33.97
33.94
29.09
29.37
28.43
28.66
28.50
28.48
28.44
28.51
28.45
28.07
28.26
28.01
28.32
24.06
27.30
25.08
28.67
28.69
29.67
30.02
29.20
29.28
29.38
29.10
29.34
29.44
29.10
28.99
28.99
29.14
29.22
27.83
29.18
26.99
29.76
28.70
31.15
31.48
30.52
30.72
30.70
30.60
30.61
30.62
30.54
30.47
30.64
30.34
30.57
27.87
30.17
27.94
30.80
30.44
TABLE 7: Average in PSNR for Wavelet Packet and Discrete Wavelet Transform of three video
sequences of Miss America, Foreman, and Carphone , using the best nine types
filters and Global threshold of 45
19. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 49
FIGURE 15: The Plot for Average PSNR for Wavelet Packet and Discrete Wavelet Transform of
using the best nine types of wavelet filters and Global threshold of 45
Thr=85
Miss America Foreman Carphone Average
PSNR [dB]
Filter
Name
DWT WP DWT WP DWT WP DWT WP
bior-2.2
bior-2.6
bior-4.4
bior-6.8
coif-2
coif-3
sym-4
sym-5
sym-7
32.10
32.46
31.41
31.76
31.51
31.45
31.60
31.65
31.35
32.09
32.44
31.40
31.74
31.37
31.54
31.30
31.68
31.44
26.35
26.73
25.56
25.91
25.75
25.80
25.58
25.81
25.82
25.83
26.04
25.41
25.75
23.00
24.99
25.05
25.77
25.84
26.58
26.99
25.93
26.20
26.26
26.11
26.08
26.27
26.22
26.48
26.08
26.09
26.36
23.79
25.77
25.37
26.41
25.82
28.34
28.72
27.63
27.96
27.84
27.79
27.75
27.91
27.80
28.13
28.19
27.64
27.95
26.05
27.43
27.24
27.95
27.70
TABLE 8: Average in PSNR for Wavelet Packet and Discrete Wavelet Transform of three video
sequences of Miss America, Foreman, and Carphone, using the best nine types
filters and Global threshold of 85
20. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 50
FIGURE 16: The Plot for Average PSNR for Wavelet Packet and Discrete Wavelet Transform of
using the best nine types of wavelet filters and Global threshold of 85
Thr=125
Miss America Foreman Carphone Average
PSNR [dB]
Filter
Name
DWT WP DWT WP DWT WP DWT WP
bior-2.2
bior-2.6
bior-4.4
bior-6.8
coif-2
coif-3
sym-4
sym-5
sym-7
30.95
31.34
30.15
30.40
30.31
30.36
30.34
30.62
30.40
30.85
31.30
30.38
30.65
30.29
30.40
30.40
30.69
30.47
24.71
25.04
23.86
24.17
24.15
24.19
24.08
24.04
24.07
24.48
24.83
23.87
24.20
22.01
23.89
24.09
24.26
24.34
24.58
24.91
23.91
24.13
24.42
24.27
24.18
24.55
24.39
24.48
24.82
24.26
24.27
23.05
24.54
24.56
24.72
24.70
26.75
27.10
25.98
26.24
26.30
26.27
26.20
26.40
26.29
26.60
26.98
26.17
26.37
25.12
26.28
26.35
26.56
26.50
TABLE 9: Average in PSNR for Wavelet Packet and Discrete Wavelet Transform of three video
sequences of Miss America, Foreman and Carphone, using the best nine types of
filters and Global threshold of 125
21. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 51
FIGURE 17: The Plot for Average PSNR for Wavelet Packet and Discrete Wavelet Transform of
using the best nine types of wavelet filters and Global threshold of 125
No. Video Sequences (WP) dB (DWT) dB Difference
(WP-DWT) - dB
1 Miss America 38.09 38.03 0.05
2 Suzie 34.59 34.58 0.01
3 Claire 37.21 37.26 -0.05
4 Mother & Daughter 33.77 33.69 0.07
5 Grandma 33.86 33.80 0.05
6 Carphone 34.88 34.53 0.34
7 Foreman 33.22 33.02 0.19
8 Salesman 32.38 32.19 0.18
TABLE 10: Average and difference in PSNR for Wavelet Packet and Discrete Wavelet Transform
of eight video sequences, using Sym5 Filter for Low-Pass-Temporal and Haar Filter
for High-Pass-Temporal Frequencies, at two Decomposition Levels, and Global
threshold of 20
6. DISCUSSION
The choice of decomposition strategy between dyadic transformation of Discrete Wavelet
Transform (DWT) and Wavelet Packet (WP) Transform has been examined and found that DWT
decomposition is preferred over the WP due to the superior values of PSNR generated
throughout the simulations for all the lower values of global threshold from thr=10 and thr=20. For
thr=45, wavelet filter of sym-5 resulted the WP outperforms DWT, and further for thr=85 and
thr=125. At thr=85, bior-4.4 and sym-5 resulted the WP to outperform DWT. Only for higher value
22. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 52
of global threshold of 125, the WP shows better PSNR values compared to DWT. However for
the results of Table 10, the average difference of WP and DWT is only marginal when global
threshold value of 20 and wavelet filter bank of sym-5 is used.
7. CONCLUSION
This paper has addressed a number of challenges identified in 3D video compression based on
wavelet transformation technique. The challenges appear when having to develop the coding
scheme of 3D wavelet video coding, the question arises whether to transform the inter-frame
followed by spatial filtering or vice versa. The substantial contribution on 3D wavelet coding is
utilizing the spatio-temporal t-2D scheme, which has successfully exploited the wavelet transform
technology and the best parameter strategies produces video output of high performance
evaluated objectively using PSNR. It has shown that the selection of the parameters within the
wavelet transform of periodic symmetric extension as the border distortion strategy, dyadic DWT,
and level dependent thresholds have resulted to produce resulted in superior quality of the
decoded video sequences. These parameters has been identified and further used in the
proposed wavelet video compression (WVC) for the overall objective performance of the decoded
video sequences. The spatio-temporal inter-frame temporal analysis of video sequences has
been extended using Birge-Massaart strategy of wavelet shrinkage employing level-dependent
quantization threshold. The objective evaluation pointed out that the PSNR values correlates
better to different types of motion of the test video sequences especially to the Carphone
sequence and using newly found Sym-5 filter-bank.
8. REFERENCES
1. Adami, N., Michele, B., Leonardi, R., and Signoroni, A. “A fully scalable wavelet video coding
scheme with homologous inter-scale prediction”. ST Journal of Research, 3(2):19-35, 2006
2. Adelson, E. H., and Simoncelli, E. “Orthogonal pyramid transforms for image coding”. In
Proceedings of SPIE Visual Communications and Image Processing II, 845:50-58, 1987
3. Albanesi, M. G., Lotto, I., and Carrioli, L. “Image compression by the wavelet decomposition”.
European Transactions on Telecommunication, 3(3):265-274, 1992.
4. Antonini, M., Barlaud, M., Mathieu, P., and Daubechies, I. “Image coding using wavelet
transform”. IEEE Transactions on Image Processing, 1(2):205-220, 1992
5. Antonio, N. “Advances in Video Coding for hand-held device implementation in networked
electronic media”. Journal of Real-Time Image Processing, 1:9-23, 2006
6. Ashourian, M.; Yusof, Z.M.; Salleh, S.H.S.; Bakar, S.A.R.A. “Robust 3-D subband video
coder”. Sixth International,Symposium on Signal Processing and its Applications, Vol 2:549–
552, 2001
7. Brislawn, M. “Classification of non-expansive symmetric extension transforms for multirate
filter banks”. Applied and Computational Harmonic Analysis, 3(4): 337-357,1996.
8. Claudia, S. “Decomposition strategies for wavelet-based image coding”. IEEE International
Symposium on Signal Processing and its Applications (ISSPA), Vol. 2: 529-532, Kuala
Lumpur, Malaysia, 2001.
9. Daubechies, I. “Orthonormal bases of compactly supported wavelets". Commun. Pure
Applied. Math, 41:909-996, 1988.
10. Daubechies, I. “The wavelet transform, time-frequency localization and signal analysis”. IEEE
Trans. on Information Theory, 36(5):961-1005, 1990.
11. Daubechies, I. “Ten Lectures on Wavelets”. Philadelphia, Pennsylvania: Society for Industrial
and Applied Mathematics (SIAM), 1992
23. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 53
12. Daubechies, I. and Sweldens, W. “Factoring wavelet transforms into lifting steps”. Journal of
Fourier Analysis and Applications, 4(3):245-267, 1998.
13. Donoho, L. “De-noising by soft-thresholding”. IEEE Transactions on Information Theory,
41(3):613-627, 1995.
14. Donoho, L. and Johnstone, I. M. “Ideal spatial adaptation via wavelet shrinkage”. Biometrika,
81(3):425-455, 1994.
15. Donoho, L. and Johnstone, I. M. “Minimax estimation via wavelet shrinkage”. The Annals of
Statistics, 26(3):879-921, 1998.
16. George, F., Dasen, M., Weiler, N., Plattner, B., and Stiller, B. “The wavevideo system and
network architecture: design and implementation”. Technical report No. 44. Computer
Engineering & Networks Laboratory (TIK), E7H, Zurich, Switzerland, 1998.
17. Golwelkar, A. V. and Woods, J. W. “Scalable video compression using longer motion-
compensated temporal filters”. VCIP 2003: 1406-1416, 2003
18. Hsiang, S. T. and Woods, J. W. “Embedded video coding using invertible motion
compensated 3-D subband/wavelet filter bank”. Journal of Signal Processing: Image
Communication, Vol. 16: 705-724, 2001.
19. Karlsson, G. and Vetterli, M. “Three dimensional subband coding of video”.In Proceedings of
IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), II:1100-1103, 1988.
20. Lewis, A. S. and Knowles, G. “Video compression using 3D wavelet transforms”. Electronic
Letters, 26(6):396-398, 1990.
21. Lewis, A. S. and Knowles, G. “Image compression using the 2-D wavelet transform”. IEEE
Trans. Image Processing, 1:244-250, 1992.
22. Luo, J. “Low bit rater wavelet-based image and video compression with adaptive
quantization, coding and post processing”. Technical Report EE-95-21. The University of
Rochester, School of Engineering and Applied Science, Department of Electrical
Engineering, Rochester, New York, 1995.
23. Mallat, S. G. 1989. “A theory for multiresolution signal decomposition: the wavelet
representation”. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7):674-
693, 1989.
24. Podilchuk, Jayant, N. S., and Farvardin, N. “Three-dimensional subband coding of video”.
IEEE Translation of Image Processing, 4(2): 125-139, 1995.
25. Shapiro, J. M. “Embedded image coding using zerotrees of wavelets coefficients”. IEEE
Transactions on Signal Processing, 41(12):3445-3462, 1993.
26. Strang, G., and Nguyen, T. “Wavelets and Filter Banks”. Wellesley-Cambridge Press,
Wellesley, MA, USA, 1997.
27. Taubman, D., and Zakhor, A. “Multirate 3-D subband coding of video”. IEEE Transactions on
Image Processing, 3(5): 572-588, 1994
28. Vass, J., Zhuang, S., Yao, J., and Zhuang, X. “Mobile video communications in wireless
environments”. In Proceedings of IEEE Workshop on Multimedia Signal Processing,
Copenhagen, Denmark: 45-50, 1999
29. Wallace, G. K. “The JPEG still picture compression standard”. Comm. ACM, 34(4):30-44,
1994.
24. Rohmad Fakeh & Abdul Azim Abd Ghani
International Journal of Image Processing, Volume (3) : Issue (1) 54
30. Wang, X., and Blostern, S. D. 1995. “Three–dimensional subband video transmission through
mobile satellite channels”. In Proceedings of International Conference on Image Processing,
Vol. 3, pp. 384-387, 1995.