This document proposes a thesis project to develop a transcoder that efficiently converts an H.264 bitstream to a VC-1 bitstream. The objective is to implement an H.264 to VC-1 transcoder for progressive content. The project is motivated by the coexistence of video coding standards such as H.264, VC-1, and MPEG-2 in applications like Blu-ray discs, which creates a need for transcoding between these formats. While other transcoding directions have been studied, published work on H.264 to VC-1 transcoding is limited. The document surveys transcoding architectures and argues that a cascaded pixel-domain architecture is best suited for heterogeneous transcoding between H.264 and VC-1.
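The cascaded pixel-domain architecture mentioned above fully decodes the incoming bitstream to raw frames and then re-encodes them in the target format. A minimal sketch of that cascade, where `decode_h264` and `encode_vc1` are hypothetical stubs standing in for real codec libraries:

```python
def decode_h264(bitstream):
    """Stub: a real implementation would fully decode each H.264
    access unit to raw YUV frames (and could expose side info such
    as motion vectors for reuse by the encoder)."""
    for access_unit in bitstream:
        yield access_unit["pixels"]  # raw frame data

def encode_vc1(frames):
    """Stub: a real implementation would run a full VC-1 encoder,
    optionally seeded with mode/motion decisions from the decoder."""
    for frame in frames:
        yield {"vc1_payload": frame}

def transcode(h264_bitstream):
    # The cascade: decode to the pixel domain, then re-encode.
    return list(encode_vc1(decode_h264(h264_bitstream)))

# Toy input: three "access units" carrying placeholder pixel data.
stream = [{"pixels": f"frame{i}"} for i in range(3)]
out = transcode(stream)
```

The appeal of this architecture for heterogeneous transcoding is visible even in the sketch: the two codecs meet only at raw frames, so no syntax-level mapping between H.264 and VC-1 is required.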
Overview of the H.264/AVC video coding standard - Circuits ... (Videoguy)
The document provides an overview of the H.264/AVC video coding standard. Some key points:
- H.264/AVC aims to double the coding efficiency of prior standards like MPEG-2 and H.263 to allow higher quality video at lower bit rates.
- It achieves this through new coding tools like fractional pixel motion compensation, variable block-size motion compensation, intra prediction, and entropy coding.
- The standard defines the decoding process but provides flexibility in encoding implementations. It is intended for both conversational and non-conversational applications like video telephony, streaming, and storage.
Error resilient for multiview video transmissions with GOP analysis (ijma)
This paper examines the effect of group of pictures (GOP) size on an H.264 multiview video coding bitstream transmitted over an error-prone network. The study analyzes bitrate performance for different GOP sizes and error rates to determine how they affect the quality of the reconstructed multiview video; by analyzing the multiview content it is possible to identify an optimum GOP size for a given application. In a comparison test, H.264 data partitioning and a multi-layer data partitioning technique are evaluated in terms of perceived quality across error rates and GOP sizes. The simulation results confirm that the multi-layer data partitioning technique performs better at higher error rates across GOP sizes. Further experiments show the effect of GOP size on visual quality and bitrate for different multiview video sequences.
The document discusses video compression standards for conferencing and internet video. It describes the components and evolution of standards including H.261, H.263, H.263+, MPEG-1, MPEG-2, and MPEG-4. It focuses on the basics of H.263 including its frame formats, picture and macroblock types, and motion vectors. It also explains the improvements of H.263+ over H.263 such as additional negotiable options.
Telcos are using IP networks and advanced video encoding like MPEG-4 Part 10 (H.264) to deliver video content over the internet via IPTV. H.264 provides much greater bandwidth efficiency than MPEG-2. Video-over-IP encapsulation maps the encoded video onto the IP network for transmission; a typical implementation includes an H.264 encoder, IP encapsulation of the stream, and transmission over Ethernet. IPTV allows telcos to compete with cable companies by providing digital TV over internet infrastructure instead of traditional cable networks.
Current developments in video quality: From the emerging HEVC standard to tem... (Harilaos Koumaras)
This document discusses current developments in video quality and the emerging HEVC video coding standard. It provides an overview of HEVC, including its key features such as flexible block structures, larger transform units, and new intra-coding and inter-coding prediction methods. Experimental results show that HEVC can achieve a 32-62% improvement in compression ratio over H.264/AVC while maintaining the same video quality. The document also discusses advances in video quality prediction through enhanced content classification of uncompressed video and improved prediction of quality for compressed video.
Video coding is an essential component of video streaming, digital TV, video chat and many other technologies. This presentation, an invited lecture to the US Patent and Trade Mark Office, describes some of the key developments in the history of video coding.
Many of the components of present-day video codecs were originally developed before 1990. From 1990 onwards, developments in video coding were closely associated with industry standards such as MPEG-2, H.264 and H.265/HEVC.
The presentation covers:
- Basic concepts of video coding
- Fundamental inventions prior to 1990
- Industry standards from 1990 to 2014
- Video coding patents and patent pools.
This document discusses 3D video encoding and delivery standards for Android devices. It covers 3D video formats like side-by-side and top-bottom, support in H.264 profiles and HDMI standards, and how to configure the encoder on TI and Qualcomm processors to add 3D signaling information to the encoded video stream. By inserting frame packing and stereo metadata, devices can automatically detect 3D content and display it correctly without user intervention.
The document summarizes a project report on comparing the MPEG-2 and H.264 video coding standards, with a focus on their main profiles. It finds that while MPEG-2 is widely used in digital broadcasting and DVD applications, H.264 provides better compression performance. However, MPEG-2 and H.264 are incompatible, but this can be addressed through transcoding. The report discusses the MPEG-2 and H.264 standards in detail and compares their encoding schemes, profiles and levels before analyzing different transcoding methods.
The document discusses the H.264 video compression standard. It provides an overview of the standard, including its objectives to improve compression performance over previous standards. Key features that allow for superior compression compared to other standards are described, such as enhanced motion estimation and an improved deblocking filter. Performance comparisons show H.264 can provide bit rate savings of up to 50% compared to other standards like MPEG-2 and H.263.
The document provides an overview of the High Efficiency Video Coding (HEVC) standard. It was developed jointly by ISO/IEC and ITU-T to provide roughly half the bit-rate of H.264/AVC for the same subjective quality. Key aspects of HEVC include use of larger block sizes, intra-picture prediction with 33 directional modes, motion vectors with quarter-sample precision, transform sizes from 4x4 to 32x32, adaptive coefficient scanning, in-loop filtering including deblocking and sample adaptive offset, and support for lossless and transform skipping modes. Many companies are starting to support HEVC in their video products and services.
This document provides an overview of HEVC (High Efficiency Video Coding) including:
- HEVC aims to provide roughly half the bitrate of H.264/AVC at the same quality.
- It uses block-based hybrid video coding with improved intra-prediction, transform, quantization and entropy coding techniques.
- HEVC supports a wide range of resolutions, color spaces and bit depths for 4K and beyond.
1) The document discusses the high-level syntax of HEVC, including the video parameter set (VPS), sequence parameter set (SPS), and picture parameter set (PPS).
2) It describes the bitstream structure and how VPS, SPS, PPS, and slice data are organized in network abstraction layer (NAL) units.
3) Key coding units like coding tree blocks (CTBs), coding blocks (CBs), and coding units (CUs) are defined, as well as the quadtree partitioning syntax used in HEVC.
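The quadtree partitioning of coding tree blocks into coding units described above can be sketched as a simple recursion. Here `should_split` is a hypothetical decision callback introduced for illustration; a real HEVC encoder makes this choice via rate-distortion optimization:

```python
def split_ctb(x, y, size, min_cu=8, should_split=None):
    """Recursively partition a CTB (top-left at x, y) into coding
    units via a quadtree. Leaves are returned as (x, y, size)."""
    if size > min_cu and should_split and should_split(x, y, size):
        half = size // 2
        cus = []
        for dy in (0, half):          # visit the four quadrants
            for dx in (0, half):
                cus += split_ctb(x + dx, y + dy, half,
                                 min_cu, should_split)
        return cus
    return [(x, y, size)]  # leaf coding unit

# Example: split a 64x64 CTB whenever the block is larger than 32x32,
# producing four 32x32 CUs.
cus = split_ctb(0, 0, 64, should_split=lambda x, y, s: s > 32)
```

In the real syntax the same recursion is driven by a `split_cu_flag` signaled per node, and the minimum and maximum CU sizes come from the SPS.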
The document discusses post-processing deblocking filters used in video coding standards like H.264 and MPEG-2. It describes how blocking artifacts can occur during video compression due to quantization and motion compensation. It then explains that deblocking filters help reduce blocking artifacts by applying filtering to block boundaries in the decoded video. Specifically, it discusses the differences between post-processing and in-loop deblocking filters, and provides details on how deblocking is implemented in standards like H.263+, H.264, MPEG-2, and JPEG.
Spatial Scalable Video Compression Using H.264 (IOSR Journals)
H.264 is a video compression standard that improves compression performance over prior standards like H.261 and H.263. Spatial scalability is achieved by encoding the video at reduced spatial resolution, which lowers the per-frame data and overall file size. The paper simulates H.264 encoding and decoding of a QCIF video using the JM reference software and compares parameters such as PSNR, CSNR, and MSE between the encoded and decoded video. H.264 provides 31-35% greater efficiency and lower bit rates than prior standards.
The document provides an overview of the emerging H.264 video coding standard and its implementation on the TMS320C64x digital media platform. It discusses key advantages of H.264 including up to 50% bit rate savings compared to other standards. It describes H.264 technical features such as various block sizes for motion estimation, high precision motion vectors, multiple reference frames, and de-blocking filters. Finally, it introduces UB Video's H.264 video processing solution UBLive-264-C64 optimized for the TMS320C64x DSP platform.
This document discusses a project that aims to capture real-time video frames using a webcam, compress the frames using the H.263 codec, transmit the encoded stream over Ethernet, decode it at the receiving end for display. It describes the tools, video compression and encoding process using H.263, packetization for transmission, decoding, and analysis of compression ratio and quality using PSNR.
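The PSNR quality analysis mentioned above is computed from the mean squared error between the original and decoded pixels, PSNR = 10·log10(MAX²/MSE). A minimal sketch over flat pixel sequences:

```python
import math

def psnr(original, decoded, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-length
    pixel sequences (8-bit by default, so MAX = 255)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(max_val ** 2 / mse)

# A tiny "frame" reconstructed with a uniform error of 5 per pixel:
ref = [100, 120, 140, 160]
dec = [105, 125, 145, 165]
print(round(psnr(ref, dec), 2))  # → 34.15
```

In practice PSNR is computed per frame over the luma plane (and sometimes chroma) and averaged across the sequence.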
Video coding standards define bitstream structures and decoding methods for video compression. Popular standards include MPEG-1/2/4 and H.264/HEVC developed by ISO/IEC and ITU-T. Standards are developed through identification of requirements, algorithm development, selection of core techniques, validation testing, and publication. They enable interoperability and future decoding of emerging standards.
The document discusses the new HEVC/H.265 video compression standard and its benefits for ultra high definition video. Key points:
- HEVC is 50% more efficient than H.264/MPEG-4 AVC, allowing a 50% reduction in bandwidth. It can support resolutions up to 8K.
- Tests show HEVC achieves 50-75% lower bitrates than H.264 for ultra high definition video, while maintaining comparable quality.
- HEVC's increased efficiency comes from processing video in 64x64 pixel blocks rather than 16x16, and parallel processing of video frames. This requires powerful multi-core processors.
- The improved compression enables
Emerging H.264 Standard: Overview and TMS320DM642-Based ... (Videoguy)
The document provides an overview of the emerging H.264 video coding standard. It describes how H.264 aims to provide 50% bit rate savings over prior standards through features like multiple block sizes, higher resolution motion estimation, and improved entropy coding. H.264 supports intra-coding of blocks within a frame and inter-coding through motion compensation between frames. Smaller block sizes allow for improved motion modeling.
This document provides an overview and comparison of the H.264 and HEVC video coding standards. It describes the key features and innovations that allow each standard to compress video more efficiently than previous standards. H.264 introduced features like adaptive block sizes, multi-frame prediction, quarter-pixel motion compensation and loop filtering that improved compression performance over prior standards. HEVC aims to further increase compression efficiency through innovations such as larger coding tree blocks, additional intra-prediction modes, and improved entropy coding. The document analyzes these standards to understand how their new coding tools enable significantly higher compression ratios and support for new applications like higher resolution video.
Complexity Analysis in Scalable Video Coding (Waqas Tariq)
Scalable video coding (SVC) is an extension of H.264/AVC. Its feature set comprises the standard H.264/AVC tools plus features that support scalability in the encoder, and these add complexity to the SVC encoder. This paper evaluates scalability and encoding time (complexity). The encoder demonstrates the scalability of the system, and the quality of the optimized scheme is acceptable.
The document compares video compression standards MPEG-4 and H.264. It discusses key aspects of each including profiles, levels, uses and future applications. MPEG-4 introduced object-based coding while H.264 provides around 50% better compression than MPEG-4 at similar quality levels. Both standards are widely used for video streaming, television broadcasting, and storage applications like Blu-ray discs. Ongoing development aims to improve support for high definition video formats.
H.261 is a video coding standard published in 1990 by ITU-T for videoconferencing over ISDN networks. It uses techniques like DCT, motion compensation, and entropy coding to achieve compression ratios over 100:1 for video calling. H.261 remains widely used in applications like Windows NetMeeting and video conferencing standards H.320, H.323, and H.324.
The document discusses MPEG-2 transport streams, which allow multiplexing of audio, video and other data into a single format suitable for transmission and storage. It describes the two multiplexing methods - program streams designed for error-free applications, and transport streams using fixed size packets for lossy applications. Transport streams carry multiple programs using packet identification and program mapping tables to associate elementary streams.
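The fixed-size transport stream packets mentioned above are 188 bytes long, beginning with a 4-byte header whose sync byte is 0x47 and whose 13-bit PID identifies the elementary stream. A minimal sketch of header parsing (the PID value below is a hypothetical example, not one reserved by the standard):

```python
def parse_ts_header(packet):
    """Parse the fixed 4-byte header of a 188-byte MPEG-2 TS packet."""
    assert len(packet) == 188 and packet[0] == 0x47, "bad sync byte"
    pid = ((packet[1] & 0x1F) << 8) | packet[2]  # 13-bit packet ID
    return {
        "payload_unit_start": bool(packet[1] & 0x40),  # new PES/section starts
        "pid": pid,
        "continuity_counter": packet[3] & 0x0F,  # 4-bit, detects packet loss
    }

# A minimal packet carrying hypothetical PID 0x0100, with the
# payload_unit_start flag set and continuity counter 7:
pkt = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
hdr = parse_ts_header(pkt)
```

A demultiplexer applies exactly this parse to every packet, routing payloads to decoders by PID using the program map tables the summary describes.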
HEVC/H.265 is a video compression standard that provides around 50% better compression over H.264/AVC for the same level of video quality. It was finalized in 2013 by the joint collaboration of MPEG and ITU-T. Key features of HEVC include support for higher resolutions like 4K and 8K, improved parallel processing abilities, increased coding efficiency through larger block sizes and an expanded set of prediction modes.
The document provides an overview of the High Efficiency Video Coding (HEVC) standard. Some key points:
- HEVC was created as a new video compression standard to address the growing needs of higher resolution video content and more efficient compression compared to prior standards like H.264.
- It achieves 50% bitrate reduction over H.264 for the same visual quality or improved quality at the same bitrate.
- The standard uses a block-based coding structure with coding tree units and supports intra-frame and inter-frame coding with motion estimation/compensation.
- It introduces more intra-prediction modes and block sizes along with improved transforms, quantization, and entropy coding.
This document presents an algorithm called Fractional Fourier Transform (FXT) to remove spectral leakage caused by non-coherent sampling of sinewaves. The algorithm works by "twisting" the time/frequency space to accommodate fractional periods. It was shown through simulations and ADC testing to automatically correct for frequency drift, maintain spectral resolution, and conserve SNR. The FXT algorithm allows using non-coherent oscillators for testing applications like ADC or waveform recorders.
The document discusses the limitations of the Fourier transform for analyzing non-stationary signals and introduces the wavelet transform as a solution. Specifically, it notes that the Fourier transform only shows the frequencies present in a signal but not when they occur over time. In contrast, the wavelet transform provides time-frequency representation by decomposing a signal into scaled and translated versions of the original or "mother" wavelet. This time-frequency representation allows the wavelet transform to be useful for applications like image compression, signal de-noising, and edge and rupture detection.
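The decomposition described above is easiest to see with the simplest wavelet: one level of the Haar transform splits a signal into a smoothed approximation (pairwise averages) and detail coefficients (pairwise differences), which together localize changes in time:

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar wavelet transform.
    Assumes an even-length input; returns (approximation, detail)."""
    s = math.sqrt(2)
    pairs = list(zip(signal[::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]  # low-pass: local averages
    detail = [(a - b) / s for a, b in pairs]  # high-pass: local changes
    return approx, detail

approx, detail = haar_step([4, 6, 10, 12, 14, 14, 16, 18])
# A flat region (14, 14) yields a zero detail coefficient, while the
# approximation carries the smoothed trend of the signal.
```

Repeating `haar_step` on the approximation gives the multi-level decomposition used in compression and de-noising, where small detail coefficients can be discarded or thresholded.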
This document discusses 3D video encoding and delivery standards for Android devices. It covers 3D video formats like side-by-side and top-bottom, support in H.264 profiles and HDMI standards, and how to configure the encoder on TI and Qualcomm processors to add 3D signaling information to the encoded video stream. By inserting frame packing and stereo metadata, devices can automatically detect 3D content and display it correctly without user intervention.
The document summarizes a project report on comparing the MPEG-2 and H.264 video coding standards, with a focus on their main profiles. It finds that while MPEG-2 is widely used in digital broadcasting and DVD applications, H.264 provides better compression performance. However, MPEG-2 and H.264 are incompatible, but this can be addressed through transcoding. The report discusses the MPEG-2 and H.264 standards in detail and compares their encoding schemes, profiles and levels before analyzing different transcoding methods.
The document discusses the H.264 video compression standard. It provides an overview of the standard, including its objectives to improve compression performance over previous standards. Key features that allow for superior compression compared to other standards are described, such as enhanced motion estimation and an improved deblocking filter. Performance comparisons show H.264 can provide bit rate savings of up to 50% compared to other standards like MPEG-2 and H.263.
The document provides an overview of the High Efficiency Video Coding (HEVC) standard. It was developed jointly by ISO/IEC and ITU-T to provide roughly half the bit-rate of H.264/AVC for the same subjective quality. Key aspects of HEVC include use of larger block sizes, intra-picture prediction with 33 directional modes, motion vectors with quarter-sample precision, transform sizes from 4x4 to 32x32, adaptive coefficient scanning, in-loop filtering including deblocking and sample adaptive offset, and support for lossless and transform skipping modes. Many companies are starting to support HEVC in their video products and services.
This document provides an overview of HEVC (High Efficiency Video Coding) including:
- HEVC aims to provide roughly half the bitrate of H.264/AVC at the same quality.
- It uses block-based hybrid video coding with improved intra-prediction, transform, quantization and entropy coding techniques.
- HEVC supports a wide range of resolutions, color spaces and bit depths for 4K and beyond.
1) The document discusses the high-level syntax of HEVC, including the video parameter set (VPS), sequence parameter set (SPS), and picture parameter set (PPS).
2) It describes the bitstream structure and how VPS, SPS, PPS, and slice data are organized in network abstraction layer (NAL) units.
3) Key coding units like coding tree blocks (CTBs), coding blocks (CBs), and coding units (CUs) are defined, as well as the quadtree partitioning syntax used in HEVC.
The document discusses post-processing deblocking filters used in video coding standards like H.264 and MPEG-2. It describes how blocking artifacts can occur during video compression due to quantization and motion compensation. It then explains that deblocking filters help reduce blocking artifacts by applying filtering to block boundaries in the decoded video. Specifically, it discusses the differences between post-processing and in-loop deblocking filters, and provides details on how deblocking is implemented in standards like H.263+, H.264, MPEG-2, and JPEG.
Spatial Scalable Video Compression Using H.264IOSR Journals
H.264 is a video compression standard that provides improved compression performance over prior standards like H.261 and H.263. It achieves spatial scalability by encoding video in a spatial manner that reduces the number of frames and file size. The paper simulates H.264 encoding and decoding of a QCIF video using JM software. It compares parameters like PSNR, CSNR, and MSE between the encoded and decoded video. H.264 provides 31-35% greater efficiency and lower bit rates compared to prior standards.
The document provides an overview of the emerging H.264 video coding standard and its implementation on the TMS320C64x digital media platform. It discusses key advantages of H.264 including up to 50% bit rate savings compared to other standards. It describes H.264 technical features such as various block sizes for motion estimation, high precision motion vectors, multiple reference frames, and de-blocking filters. Finally, it introduces UB Video's H.264 video processing solution UBLive-264-C64 optimized for the TMS320C64x DSP platform.
This document discusses a project that aims to capture real-time video frames using a webcam, compress the frames using the H.263 codec, transmit the encoded stream over Ethernet, decode it at the receiving end for display. It describes the tools, video compression and encoding process using H.263, packetization for transmission, decoding, and analysis of compression ratio and quality using PSNR.
Video coding standards define bitstream structures and decoding methods for video compression. Popular standards include MPEG-1/2/4 and H.264/HEVC developed by ISO/IEC and ITU-T. Standards are developed through identification of requirements, algorithm development, selection of core techniques, validation testing, and publication. They enable interoperability and future decoding of emerging standards. [/SUMMARY]
The document discusses the new HEVC/H.265 video compression standard and its benefits for ultra high definition video. Key points:
- HEVC is 50% more efficient than H.264/MPEG-4 AVC, allowing a 50% reduction in bandwidth. It can support resolutions up to 8K.
- Tests show HEVC achieves 50-75% lower bitrates than H.264 for ultra high definition video, while maintaining comparable quality.
- HEVC's increased efficiency comes from processing video in 64x64 pixel blocks rather than 16x16, and parallel processing of video frames. This requires powerful multi-core processors.
- The improved compression enables
Emerging H.264 Standard: Overview and TMS320DM642- Based ...Videoguy
The document provides an overview of the emerging H.264 video coding standard. It describes how H.264 aims to provide 50% bit rate savings over prior standards through features like multiple block sizes, higher resolution motion estimation, and improved entropy coding. H.264 supports intra-coding of blocks within a frame and inter-coding through motion compensation between frames. Smaller block sizes allow for improved motion modeling.
This document provides an overview and comparison of the H.264 and HEVC video coding standards. It describes the key features and innovations that allow each standard to compress video more efficiently than previous standards. H.264 introduced features like adaptive block sizes, multi-frame prediction, quarter-pixel motion compensation and loop filtering that improved compression performance over prior standards. HEVC aims to further increase compression efficiency through innovations such as larger coding tree blocks, additional intra-prediction modes, and improved entropy coding. The document analyzes these standards to understand how their new coding tools enable significantly higher compression ratios and support for new applications like higher resolution video.
Complexity Analysis in Scalable Video CodingWaqas Tariq
The scalable video coding is the extension of H.264/AVC. The features in scalable video coding, are the standard features in H.264/AVC and some features which is supporting the scalability of the encoder. Those features bring some more complexity in SVC encoder. In this paper, the evaluation of scalability and encoding time (complexity) has been performed. The encoder shows the scalability of the system and quality of the optimized scheme is acceptable.
The document compares video compression standards MPEG-4 and H.264. It discusses key aspects of each including profiles, levels, uses and future applications. MPEG-4 introduced object-based coding while H.264 provides around 50% better compression than MPEG-4 at similar quality levels. Both standards are widely used for video streaming, television broadcasting, and storage applications like Blu-ray discs. Ongoing development aims to improve support for high definition video formats.
H.261 is a video coding standard published in 1990 by ITU-T for videoconferencing over ISDN networks. It uses techniques like DCT, motion compensation, and entropy coding to achieve compression ratios over 100:1 for video calling. H.261 remains widely used in applications like Windows NetMeeting and video conferencing standards H.320, H.323, and H.324.
The document discusses MPEG-2 transport streams, which allow multiplexing of audio, video and other data into a single format suitable for transmission and storage. It describes the two multiplexing methods - program streams designed for error-free applications, and transport streams using fixed size packets for lossy applications. Transport streams carry multiple programs using packet identification and program mapping tables to associate elementary streams.
HEVC/H.265 is a video compression standard that provides around 50% better compression over H.264/AVC for the same level of video quality. It was finalized in 2013 by the joint collaboration of MPEG and ITU-T. Key features of HEVC include support for higher resolutions like 4K and 8K, improved parallel processing abilities, increased coding efficiency through larger block sizes and an expanded set of prediction modes.
The document provides an overview of the High Efficiency Video Coding (HEVC) standard. Some key points:
- HEVC was created as a new video compression standard to address the growing needs of higher resolution video content and more efficient compression compared to prior standards like H.264.
- It achieves 50% bitrate reduction over H.264 for the same visual quality or improved quality at the same bitrate.
- The standard uses a block-based coding structure with coding tree units and supports intra-frame and inter-frame coding with motion estimation/compensation.
- It introduces more intra-prediction modes and block sizes along with improved transforms, quantization, and entropy coding.
This document presents an algorithm called Fractional Fourier Transform (FXT) to remove spectral leakage caused by non-coherent sampling of sinewaves. The algorithm works by "twisting" the time/frequency space to accommodate fractional periods. It was shown through simulations and ADC testing to automatically correct for frequency drift, maintain spectral resolution, and conserve SNR. The FXT algorithm allows using non-coherent oscillators for testing applications like ADC or waveform recorders.
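The spectral-leakage effect this paper addresses is easy to reproduce. The sketch below (plain NumPy on an illustrative 64-sample sinewave; it demonstrates the problem, not the paper's FXT algorithm) measures how much energy leaks out of the peak FFT bin when the record contains a non-integer number of periods:

```python
import numpy as np

def leakage_fraction(cycles, n=64):
    """Fraction of spectral energy falling outside the peak bin pair."""
    t = np.arange(n)
    x = np.sin(2 * np.pi * cycles * t / n)
    spec = np.abs(np.fft.fft(x)) ** 2
    k = int(np.argmax(spec[1 : n // 2])) + 1   # strongest positive-frequency bin
    inband = spec[k] + spec[n - k]             # peak bin plus its conjugate
    return 1.0 - inband / spec.sum()

print(leakage_fraction(8.0))   # coherent sampling: essentially zero leakage
print(leakage_fraction(8.5))   # non-coherent: energy smeared across the spectrum
```

With exactly 8 cycles in the record the energy stays in one bin pair; at 8.5 cycles more than half of it smears into neighbouring bins, which is the degradation the FXT algorithm is designed to remove.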
The document discusses the limitations of the Fourier transform for analyzing non-stationary signals and introduces the wavelet transform as a solution. Specifically, it notes that the Fourier transform only shows the frequencies present in a signal but not when they occur over time. In contrast, the wavelet transform provides time-frequency representation by decomposing a signal into scaled and translated versions of the original or "mother" wavelet. This time-frequency representation allows the wavelet transform to be useful for applications like image compression, signal de-noising, and edge and rupture detection.
Digital video watermarking scheme using discrete wavelet transform and standa...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
The document discusses video coding techniques for compression and transmission. It covers traditional hybrid video coding standards using motion compensation (H.261, H.263, MPEG), as well as newer techniques like wavelet video coding, error resilient transmission, rate-scalable coding, and distributed video coding without layers. These newer techniques can provide better rate-distortion performance than standard codecs or more graceful quality degradation over lossy networks.
The document discusses various techniques for image compression including:
- Run-length coding which encodes repeating pixel values and their lengths.
- Difference coding which encodes the differences between pixel values.
- Block truncation coding which divides images into blocks and assigns codewords.
- Predictive coding which predicts pixel values from neighbors and encodes differences.
Reversible compression allows exact reconstruction while lossy compression sacrifices some information for higher compression but images remain visually similar. Combining techniques can achieve even higher compression ratios.
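Two of the techniques listed above are simple enough to sketch directly. The following illustrative Python (hypothetical helper names, not from the document) applies run-length coding and difference coding to one pixel row:

```python
def rle_encode(pixels):
    """Run-length coding: store each value once with its repeat count."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]

def diff_encode(pixels):
    """Difference coding: store the first value, then successive deltas."""
    return pixels[:1] + [b - a for a, b in zip(pixels, pixels[1:])]

row = [200, 200, 200, 200, 10, 10, 12]
print(rle_encode(row))   # [(200, 4), (10, 2), (12, 1)]
print(diff_encode(row))  # [200, 0, 0, 0, -190, 0, 2]
assert rle_decode(rle_encode(row)) == row   # reversible (lossless)
```

Both are reversible on their own; as the text notes, combining them with an entropy coder yields higher compression ratios.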
The document introduces the limitations of Fourier transforms for analyzing non-stationary signals and discusses how wavelet transforms provide a solution. Fourier transforms can only provide frequency content over the entire signal duration and cannot localize this content in time. Wavelet transforms overcome this by analyzing the signal with translated and scaled versions of an analyzing wavelet, allowing time-frequency localization. This provides a time-frequency representation of non-stationary signals and indicates when different frequency components occur.
Wavelet analysis involves representing a signal as a sum of wavelet functions of varying location and scale. Wavelet transforms allow for efficient video compression by removing spatial and temporal redundancies. Without compression, transmitting uncompressed video would require huge storage and bandwidth. Using wavelet compression, a day of video could be stored using the same space as an uncompressed minute. The discrete wavelet transform decomposes a signal into different frequency subbands, making it suitable for scalable and tolerant video compression standards like JPEG2000. Wavelet compression provides better quality at low bit rates compared to DCT techniques like JPEG.
Image compression using discrete wavelet transform - Harshal Ladhe

This document discusses image compression using the discrete wavelet transform (DWT) as outlined in the JPEG2000 standard. It presents the basic block diagram of image compression, including the encoder and decoder. It demonstrates color and gray-scale image compression across multiple levels of compression, showing the original and compressed images. It concludes that DWT provides high compression ratios while maintaining image quality and outperforms other traditional techniques. Future work is proposed to implement neural network-based compression.
The document discusses performing a discrete wavelet transform (DWT) on a 1D signal using MATLAB. It loads a test signal, performs a 5-level DWT decomposition using the coif3 wavelet, then reconstructs the approximation and detail signals at each level. Plots of the original, approximation, and detail signals are generated.
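The same decompose-and-reconstruct experiment can be sketched without MATLAB's Wavelet Toolbox. The minimal NumPy version below uses a single-level Haar transform (rather than the 5-level coif3 decomposition described above) to show the split into approximation and detail coefficients and the perfect reconstruction:

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: split into approximation and detail subbands."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # low-pass: approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # high-pass: detail coefficients
    return a, d

def haar_idwt(a, d):
    """Inverse transform: interleave the reconstructed even/odd samples."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

sig = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = haar_dwt(sig)
print(a)   # smooth trend of the signal
print(d)   # local differences (near zero where the signal is flat)
assert np.allclose(haar_idwt(a, d), sig)   # perfect reconstruction
```

Repeating `haar_dwt` on the approximation output gives the multi-level decomposition the MATLAB example performs with coif3.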
This document summarizes a student project on implementing lossless discrete wavelet transform (DWT) and inverse discrete wavelet transform (IDWT). It provides an overview of the project, which includes introducing DWT, reviewing literature on lifting schemes for faster DWT computation, and simulating a 2D (5,3) DWT. The results show DWT blocks decomposing signals into high and low pass coefficients. Applications mentioned are in medical imaging, signal denoising, data compression and image processing. The conclusion discusses the need for lossless transforms in medical imaging. Future work could extend this to higher level transforms and applications like compression and watermarking.
JPEG is a lossy image compression algorithm, not a file format. It uses a 4-step process to compress images: 1) transforming RGB to YCbCr color space, 2) applying a discrete cosine transformation to identify redundant data, 3) quantizing the remaining data, and 4) encoding the result to minimize storage requirements. Typical compression ratios are 10:1 to 20:1 without visible loss and up to 100:1 compression for low quality applications.
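Step 1 of the pipeline above, the RGB to YCbCr color-space transform, is a fixed linear map. A minimal sketch using the standard JFIF full-range coefficients (no clipping or chroma subsampling, which a real encoder would add):

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """JFIF full-range RGB -> YCbCr (step 1 of JPEG compression)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b             # luma
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128   # blue-difference chroma
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128   # red-difference chroma
    return np.stack([y, cb, cr], axis=-1)

gray = np.array([[[100.0, 100.0, 100.0]]])
print(rgb_to_ycbcr(gray)[0, 0])   # neutral gray carries no chroma information
```

Separating luma from chroma is what lets JPEG later discard chroma detail aggressively, since the eye is far less sensitive to it.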
This document summarizes a presentation on wavelet based image compression. It begins with an introduction to image compression, describing why it is needed and common techniques like lossy and lossless compression. It then discusses wavelet transforms and how they are applied to image compression. Several research papers on wavelet compression techniques are reviewed and key advantages like higher compression ratios while maintaining image quality are highlighted. Applications of wavelet compression in areas like biomedicine and multimedia are presented before concluding with references.
This document discusses image compression techniques. It begins by explaining the goals of image compression which are to reduce storage requirements and increase transmission rates by reducing the amount of data needed to represent a digital image. It then describes lossless and lossy compression approaches, noting that lossy approaches allow for higher compression ratios but are not information preserving. The document goes on to explain various compression methods including transforms like DCT that reduce interpixel redundancy, quantization, entropy encoding, and standards like JPEG that use these techniques.
Image compression introductory presentation - Tariq Abbas
This document discusses image compression techniques. It explains that the goal of compression is to reduce the amount of data needed to represent a digital image by eliminating redundant information like coding, interpixel, and psychovisual redundancies. Compression can be lossy or lossless. Lossy methods allow for data loss but provide higher compression, while lossless preserves all image data. Common lossy techniques include JPEG, which uses discrete cosine transform and quantization, and lossless methods include run length and Huffman encoding.
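Of the lossless methods named above, Huffman coding is compact enough to demonstrate. An illustrative implementation (not from the document) using Python's heapq, which repeatedly merges the two least-frequent subtrees:

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a prefix-free Huffman code table from symbol frequencies."""
    # Each heap entry: [total frequency, tiebreaker, {symbol: codeword}]
    heap = [[freq, i, {sym: ""}] for i, (sym, freq) in
            enumerate(Counter(data).items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)            # least frequent subtree
        hi = heapq.heappop(heap)            # next least frequent
        merged = {s: "0" + c for s, c in lo[2].items()}
        merged.update({s: "1" + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], n, merged])
        n += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)
assert len(codes["a"]) < len(codes["c"])   # frequent symbols get short codes
```

The resulting table assigns the shortest codeword to the most frequent symbol, which is exactly how JPEG's entropy-coding stage shrinks the quantized coefficients.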
This document provides an overview of common image compression techniques. It discusses why images are compressed, how compression reduces redundancy in images, and the general flow of image compression systems. It describes lossy compression methods like discrete cosine transform and discrete wavelet transform that reduce pixel correlation. Lossless compression techniques including quantization, Huffman coding, arithmetic coding and run length coding are also covered. Popular image compression algorithms like JPEG, JPEG 2000 and shape-adaptive compression are briefly outlined.
Presentation given in the Seminar of B.Tech 6th Semester during session 2009-10 By Paramjeet Singh Jamwal, Poonam Kanyal, Rittitka Mittal and Surabhi Tyagi.
This document summarizes spatial scalable video compression using H.264. It discusses previous video compression standards like H.261 and H.263. It then describes the key components of the H.264 encoder and decoder, including prediction models, spatial models and entropy encoding. Simulation results comparing parameters like PSNR, CSNR and MSE between encoded and decoded video using H.264 are presented. The paper concludes that H.264 provides 31-35% improved efficiency and bit rate reduction over previous standards.
The document summarizes the key features and tools of the H.264/AVC video coding standard. It describes how H.264/AVC achieves significant gains in compression efficiency of up to 50% compared to previous standards through the use of new tools like multiple reference frames, fractional pixel motion estimation, an adaptive deblocking filter, and an integer transform. It also notes that while the decoder complexity of H.264/AVC is higher than previous standards, the standard aims to provide efficient video compression for both interactive and non-interactive applications across different networks and storage media.
Introduction to Video Compression Techniques - Anurag Jain - Videoguy
The document provides an overview of video compression techniques and standards. It discusses the motivation for video compression to reduce data sizes for storage and transmission. It then reviews several key video compression standards including H.261, H.263, MPEG-1, MPEG-2, MPEG-4, H.264 and others. For each standard, it summarizes the goals, features, applications and technical details like motion compensation methods, block sizes, and bitrate ranges.
This document describes a project to design an H.264 video decoder using Verilog. It implements the key decoding blocks like Context-Based Adaptive Binary Arithmetic Coding (CABAC), inverse quantization, and inverse discrete cosine transform. CABAC is the entropy decoding method used in H.264 that is computationally intensive. The project develops hardware modules for these blocks to accelerate decoding and enable real-time performance. It presents the designs of the individual modules and simulation results showing their functionality. The goal is to improve on software implementations by using dedicated hardware for the critical decoding stages.
A REAL-TIME H.264/AVC ENCODER&DECODER WITH VERTICAL MODE FOR INTRA FRAME AND ... - csandit
The video coding standards are being developed to satisfy the requirements of applications for
various purposes, better picture quality, higher coding efficiency, and more error robustness.
The new international video coding standard H.264/AVC aims at significant
improvements in coding efficiency and error robustness in comparison with previous
standards such as MPEG-2, H.261, and H.263. Video streams need to be processed through
several steps in order to encode and decode the video such that it is compressed efficiently with
available limited resources of hardware and software. All advantages and disadvantages of
available algorithms should be known to implement a codec to accomplish final requirement.
The purpose of this project is to implement all basic building blocks of the H.264 video encoder and
decoder. The significance of the project is the inclusion of all components required to encode
and decode a video in MATLAB.
The latest video compression standard, H.264 (also known as MPEG-4 Part 10/AVC for Advanced Video
Coding), is expected to become the video standard of choice in the coming years.
H.264 is an open, licensed standard that supports the most efficient video compression techniques available
today. Without compromising image quality, an H.264 encoder can reduce the size of a digital video file by
more than 80% compared with the Motion JPEG format and as much as 50% more than with the MPEG-4
Part 2 standard. This means that much less network bandwidth and storage space are required for a video
file. Or seen another way, much higher video quality can be achieved for a given bit rate.
This white paper discusses the H.264 video compression standard and its applications in video surveillance. H.264 provides much more efficient video compression than previous standards like MPEG-4 Part 2, reducing file sizes by over 50% while maintaining quality. This standard is well-suited for high-resolution, high frame rate surveillance applications where bandwidth and storage savings are most significant. While H.264 requires more powerful encoding and decoding hardware, it allows for higher quality surveillance at lower bit rates than previous standards.
The document discusses the H.264 video compression standard and its applications in video surveillance. H.264 provides much more efficient video compression than previous standards like MPEG-4 and Motion JPEG, reducing file sizes by over 80% without compromising quality. This allows for higher resolution, frame rate, and quality video streams using the same or lower bandwidth and storage compared to earlier standards. H.264 compression will enable uses like high frame rate surveillance at airports and casinos where bandwidth savings are most significant.
Requiring only half the bitrate of its predecessor, the new standard – HEVC or H.265 – will significantly reduce the need for bandwidth and expensive, limited spectrum. HEVC (H.265) will enable the launch of new video services and in particular ultra HD television (UHDTV).
State-of-the-art video compression techniques – HEVC/H.265 – can reduce the size of raw video by a factor of about 100 without any noticeable reduction in visual quality. With estimates indicating that compressed real-time video accounts for more than 50 percent of current network traffic, and this figure is set to rise to 90 percent within a few years, HEVC/H.265 will be a welcome relief for network operators.
New services, devices and changing viewing patterns are among the factors contributing to the growth in video traffic as people watch more and more traditional TV and video-streaming services on their mobile devices.
Ericsson has been heavily involved in the standardization of HEVC since it began in 2010, and this Ericsson Review article highlights some of the contributions that have led to the compression efficiency offered by HEVC.
This document provides an overview and comparison of the H.265/HEVC and H.264/AVC video coding standards. It summarizes the key features and techniques of each, such as HEVC achieving around 40% higher data compression compared to H.264/AVC through improvements to prediction, transform coding, and entropy encoding. Experimental results testing various video sequences show HEVC provides significantly better compression efficiency. The document also reviews the technical details and implementations of both standards.
H.264 offers several technical advantages over MPEG-4 for video compression including finer-grained motion prediction, integer transforms, deblocking filters, and the ability to use multiple reference pictures. H.264 was designed to avoid the complex licensing issues of MPEG-4 and aims to not require royalty payments for its baseline profile. If H.264 can successfully avoid licensing controversies, it has the potential to see widespread adoption for uses beyond videoconferencing such as video streaming and storage.
This white paper discusses the H.264 video compression standard and its applications in video surveillance. It provides an introduction to H.264 and how it offers significantly higher compression rates than previous standards like MPEG-4 Part 2, reducing bandwidth and storage needs. It then explains how video compression works, the development of the H.264 standard, and how it supports different profiles and levels to optimize various applications and formats. The paper concludes that H.264 will be widely adopted and help enable higher resolution surveillance applications.
This white paper discusses the H.264 video compression standard and its applications in video surveillance. It provides an introduction to H.264 and how it offers significantly higher compression rates than previous standards like MPEG-4 Part 2, reducing bandwidth and storage needs. It then covers the development of H.264 as a joint project between telecommunications and IT organizations, and how it supports various applications. Finally, it briefly explains the basics of video compression and some key aspects of H.264, such as profiles and levels that define its capabilities and complexity.
H.264 is a new video compression standard that provides much more efficient compression than previous standards like MPEG-4 and Motion JPEG. It can reduce file sizes by 50-80% while maintaining the same quality. H.264 supports applications with different bandwidth and latency requirements. It uses various frame types and motion compensation techniques to reduce redundant data between frames. These techniques, along with an improved intra-frame prediction method, allow H.264 to compress video much more efficiently than prior standards.
Performance Evaluation of H.264 AVC Using CABAC Entropy Coding For Compound I... - DR.P.S.JAGADEESH KUMAR
The document evaluates the performance of H.264/AVC entropy coding (CABAC) for compressing different types of images. It is observed that CABAC is highly efficient at compressing compound images, achieving higher compression ratios and PSNR values compared to other image types at high bitrates. The proposed system compresses grayscale compound images using CABAC after applying Daubechies wavelet transform. Performance is measured using compression ratio and PSNR metrics at varying bits per pixel.
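PSNR, the quality metric used in the evaluation above, is defined directly from mean squared error. A minimal sketch (illustrative helper, not the paper's code):

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between a reference and a test image."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    if mse == 0:
        return float("inf")                 # identical images
    return 10 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 100.0)
noisy = ref + 5.0                           # constant error of 5 -> MSE = 25
print(round(psnr(ref, noisy), 2))           # 10*log10(255^2 / 25) ~ 34.15 dB
```

Higher PSNR at the same bits per pixel is what the compound-image results above report in CABAC's favour.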
H.264 and HEVC are video compression standards. H.264 was developed in 2003 as an improvement over prior standards, allowing video compression at half the bit rate with the same quality. HEVC was developed in 2013 to replace H.264, providing around 50% better compression efficiency through features like larger block sizes, more intra-prediction modes, and adaptive transform sizes. The document provides details on the coding structures, prediction methods, and other techniques used in each standard.
Motion Vector Recovery for Real-time H.264 Video Streams - IDES Editor
Among the various network protocols that can be used to stream video data,
RTP over UDP is best suited to real-time streaming of H.264 based video. Videos
transmitted over a communication channel are highly prone to errors, which can
become critical when UDP is used. In such cases real-time error concealment
becomes an important aspect. A subclass of error concealment is motion vector
recovery, which is used to conceal errors at the decoder side. Lagrange
interpolation is a fast and popular technique for motion vector recovery. This paper proposes
a new system architecture which enables the RTP-UDP based
real time video streaming as well as the Lagrange
interpolation based real time motion vector recovery in H.264
coded video streams. A completely open source H.264 video
codec called FFmpeg is chosen to implement the proposed
system. The proposed implementation was tested against
different standard benchmark video sequences and the
quality of the recovered videos was measured at the decoder
side using various quality measurement metrics.
Experimental results show that the real time motion vector
recovery does not introduce any noticeable difference or
latency during display of the recovered video.
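Lagrange interpolation, the recovery technique named in the abstract, fits a polynomial through the neighbouring received values and evaluates it at the lost position. A minimal sketch with hypothetical motion-vector components (illustrative data, not from the paper):

```python
def lagrange_interp(xs, ys, x):
    """Evaluate the Lagrange polynomial through points (xs, ys) at position x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:                       # basis polynomial L_i(x)
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Hypothetical x-components of motion vectors in neighbouring blocks;
# the vector at position 2 was lost in transmission.
xs = [0, 1, 3, 4]
ys = [2.0, 4.0, 8.0, 10.0]
recovered = lagrange_interp(xs, ys, 2)
print(recovered)   # the neighbours follow a linear trend, so recovery is exact
```

Because the decoder only evaluates a low-degree polynomial from a handful of neighbours, the method is cheap enough for the real-time recovery the paper targets.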
HARDWARE SOFTWARE CO-SIMULATION OF MOTION ESTIMATION IN H.264 ENCODER - cscpconf
This paper addresses motion estimation in the H.264/AVC encoder. Compared with standards
such as MPEG-2 and MPEG-4 Visual, H.264 can deliver better image quality at the same
compressed bit rate or at a lower bit rate. The increase in compression efficiency comes at the
expense of increased complexity, which must be overcome. An efficient co-design
methodology is required, where the encoder software application is highly optimized and
structured in a very modular and efficient manner, so as to allow its most complex and time
consuming operations to be offloaded to dedicated hardware accelerators. The Motion
Estimation algorithm is the most computationally intensive part of the encoder which is simulated using MATLAB. The hardware/software co-simulation is done using system generator tool and implemented using Xilinx FPGA Spartan 3E for different scanning methods.
The H.264/AVC Advanced Video Coding Standard: Overview and ... - Videoguy
This document provides an overview of the H.264/AVC video coding standard and its Fidelity Range Extensions (FRExt). It discusses how H.264/AVC was developed jointly by ISO/IEC MPEG and ITU-T VCEG to improve coding efficiency over prior standards. The FRExt amendment adds support for higher chroma sampling, bit depths, and other capabilities for demanding professional applications. Initial industry feedback indicates rapid adoption of the High Profile added in FRExt.
This paper proposes an adaptive energy management policy for wireless video streaming between a battery-powered client and server. It models the energy consumption of the server and client based on factors like CPU frequency, transmission power, and channel bandwidth. The paper formulates an optimization problem to assign optimal energy to each video frame. This maximizes system lifetime while meeting a minimum video quality requirement. Experimental results show the proposed policy increases overall system lifetime by 20% on average.
Microsoft PowerPoint - WirelessCluster_Pres - Videoguy
This document analyzes delays in unicast video streaming over IEEE 802.11 WLAN networks. It describes conducting an experiment using a testbed with a Darwin Streaming Server and WLAN probe to capture packets. The analysis found that video bitrate variations, packetization scheme, bandwidth load, and frame-based nature of video all impacted mean delay. Bursts of packets from video frames caused per-packet delay to increase in a sawtooth pattern. Increasing uplink load was also found to affect delay variations.
Proxy Cache Management for Fine-Grained Scalable Video Streaming - Videoguy
This document proposes a novel video caching framework that uses MPEG-4 Fine-Grained Scalable (FGS) video with post-encoding rate control to achieve low-cost and fine-grained rate adaptation. The framework allows clients to have heterogeneous bandwidths and enables adaptive control of backbone bandwidth consumption. It examines issues in caching FGS videos, such as determining the optimal portion to cache (in terms of length and rate) and optimal streaming rate to clients. Simulation results show it significantly reduces transmission costs compared to non-adaptive caching while providing flexible utility to heterogeneous clients with low computational overhead.
The document compares Microsoft Windows Media and the Adobe Flash Platform for streaming media. It discusses key differences like user experience, workflows, and playback reach. Flash offers more flexibility in creative expression, richer interactions, and wider device playback than Windows Media. It also has a 98% install base, making it easier for viewers to watch streams without extra software. The document outlines workflows for experience design, programming, broadcasting, production, and more using Flash tools versus Microsoft alternatives.
Free-riding Resilient Video Streaming in Peer-to-Peer Networks - Videoguy
This document summarizes a PhD thesis about free-riding resilient video streaming in peer-to-peer networks. The thesis contains research on two approaches: tree-based live streaming and swarm-based video-on-demand. For tree-based live streaming, the thesis presents the Orchard algorithm for constructing and maintaining trees to distribute video in a peer-to-peer network. It analyzes attacks on Orchard like free-riding and evaluates Orchard's performance under different conditions through experiments. For swarm-based video-on-demand, the thesis introduces the Give-to-Get approach for distributing video files and compares it to other peer-to-peer protocols. It evaluates Give-to-Get's performance in experiments
BT has developed Fastnets technology to improve video streaming. It avoids start-up delays and picture freezing during congestion. Fastnets streams multiple encoded versions of the video at different data rates and seamlessly switches between them based on available bandwidth to maintain quality without pausing. This allows for near-instant start times and reduces bandwidth usage by up to 30%. Fastnets provides a high-quality video streaming solution for both mobile and IPTV applications.
This document summarizes recent research on video streaming over Bluetooth networks. It discusses three key areas: intermediate protocols, quality of service (QoS) control, and media compression. For intermediate protocols, it evaluates streaming via HCI, L2CAP, and IP layers and their tradeoffs. For QoS control, it describes how error control mechanisms like link layer FEC, retransmission, and error concealment can improve video quality over Bluetooth. It also discusses congestion control. For media compression, it notes the importance of compression to achieve efficiency over limited Bluetooth bandwidths.
The document discusses video streaming, including definitions and concepts. It covers topics such as the difference between streaming and downloading, common streaming categories like live and on-demand, protocols used for streaming like RTSP and RTP, and the development process for creating streaming video including content planning, capturing, editing, encoding, and integrating with servers.
Inlet Technologies offers a live video streaming solution called Spinnaker that uses Intel Xeon processors with quad-core technology. Spinnaker can encode live video streams into multiple formats and resolutions simultaneously. This allows content to be delivered optimally to various devices. Spinnaker is a flexible, scalable solution that can increase broadcast capacity cost-effectively while maintaining high video quality.
Considerations for Creating Streamed Video Content over 3G ... - Videoguy
The document discusses considerations for creating video content that can be streamed over mobile networks with restricted bandwidth like 3G-324M. It covers topics like video basics, codecs, profiles and levels, video streaming techniques, guidelines for authoring mobile-friendly content, and tools for analyzing video streams. The goal is to help content creators optimize video quality for low-bandwidth mobile viewing.
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMING - Videoguy
This document summarizes recent advances in channel-adaptive video streaming. It reviews adaptive media playout at the client to reduce latency, rate-distortion optimized packet scheduling to determine the best packet to send, and channel-adaptive packet dependency control to improve error robustness and reduce latency. It also discusses challenges for wireless video streaming and different wireless streaming architectures.
Impact of FEC Overhead on Scalable Video Streaming - Videoguy
EE 5359 PROPOSAL
H.264 to VC-1 TRANSCODING
Vidhya Vijayakumar
Student I.D.: 1000-622152
Date: September 24, 2009
H.264 to VC-1 TRANSCODER
OBJECTIVE:
The objective of the thesis is to implement an H.264 bitstream to VC-1
bitstream transcoder for progressive compression.
MOTIVATION:
Adoption of high definition video has been growing rapidly over the last five
years. The high definition disc format Blu-ray has mandated MPEG-2 [3], H.264 [2]
and VC-1 [1] as video compression formats. The coexistence of these different video
coding standards creates a need for transcoding. As more and more end products use
the above standards, transcoding from one format to another adds value to a
product's capability. While there has been recent work on MPEG-2 to H.264
transcoding [3] and VC-1 to H.264 transcoding [4], published work on H.264 to VC-1
transcoding is nearly non-existent. This has created the motivation to develop a
transcoder that can efficiently transcode an H.264 bitstream to a VC-1 bitstream.
DETAILS:
Video transcoding is the operation of converting video from one format to
another [5]. A format is defined by characteristics such as bit-rate, spatial
resolution, etc. One of the earliest applications of transcoding is to adapt the
bit-rate of a compressed stream to the channel bandwidth, for universal multimedia
access over all kinds of channels, such as wireless networks, the Internet and
dial-up networks. Changes in the characteristics of an encoded stream, such as
bit-rate, spatial resolution and quality, can also be achieved by scalable video
coding [5]. However, in cases where the available network bandwidth is insufficient
or fluctuates with time, it may be difficult to set the base layer bit-rate. In
addition, scalable video coding demands additional complexity at both the encoder
and the decoder.
The basic architecture for converting an H.264 bitstream into a VC-1
elementary stream arises from completely decoding the H.264 stream and then
re-encoding it into a VC-1 stream. However, this involves significant computational
complexity [6]. Hence there is also a need to transcode at low complexity.
Transcoding can in general be implemented in the spatial domain or in the
transform domain or in a combination of the two domains. The common transcoding
architectures [5] are:
Open loop transform domain transcoding
Fig. 1 Open loop transform domain transcoder architecture [5]
Open loop transcoders (Fig 1) are computationally efficient. They operate in the
DCT domain. However, they are subject to drift error, which occurs due to rounding,
quantization loss and clipping functions.
Cascaded Pixel Domain Architecture (CPDT)
Fig. 2 Cascaded pixel domain transcoder architecture [5]
This is the most basic transcoding architecture (Fig 2). The motion vectors from the
incoming bit stream are extracted and reused, eliminating the motion estimation
block, which accounts for about 60% of the encoder computation. Compared with the
previous architecture, CPDT is drift free. Hence, even though it is slightly more
complex, it is well suited for heterogeneous transcoding between different
standards, where basic parameters such as mode decisions and motion vectors have to
be re-derived.
Simplified DCT Domain transcoders (SDDT)
Fig. 3 Simplified transform domain transcoder architecture [5]
This transcoder is based on the assumption that the DCT, IDCT and motion
compensation are linear processes (Fig 3). The architecture requires that motion
compensation be performed in the DCT domain, which is a major computationally
intensive operation [3]. For instance, as shown in Fig. 4, the goal is to compute
the DCT coefficients of the target block B from the four overlapping blocks B1, B2,
B3 and B4.
Fig. 4 Transform domain motion compensation illustration [5]
Also, the clipping and rounding operations performed for interpolation
in fractional pixel motion compensation lead to drift in the transcoded video.
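The target-block extraction of Fig. 4 can be sketched as follows. This is a minimal NumPy illustration shown in the pixel domain for clarity; block size, variable names and offsets are my own assumptions, and SDDT performs the equivalent computation entirely in the DCT domain using constant shift matrices.

```python
import numpy as np

def extract_target_block(b1, b2, b3, b4, dy, dx, n=8):
    """Assemble the motion-displaced target block B from the four
    overlapping source blocks B1..B4 (cf. Fig. 4). SDDT carries out
    this same sum in the DCT domain via pre-/post-multiplication with
    constant shift matrices; here it is shown in the pixel domain."""
    # Stack the four n x n blocks into one 2n x 2n area:
    #   B1 B2
    #   B3 B4
    area = np.block([[b1, b2], [b3, b4]])
    # The target block starts (dy, dx) samples into the area, 0 <= dy, dx < n
    return area[dy:dy + n, dx:dx + n]
```

In the transform domain this becomes DCT(B) = sum_i DCT(H_i) DCT(B_i) DCT(W_i), where H_i and W_i are sparse 0/1 shift matrices whose DCTs can be precomputed; the matrix products are what makes DCT-domain motion compensation computationally intensive.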
Cascaded DCT Domain transcoders (CDDT)
Fig. 5 Cascaded transform domain transcoder architecture [5]
This architecture is used for spatial/temporal resolution downscaling and other
coding parameter changes (Fig 5). Compared with SDDT, greater flexibility is
achieved by introducing another transform domain motion compensation block;
however, it is far more computationally intensive and requires more memory [3]. It
is often applied to downscaling applications, where the memory cost at the encoder
end is modest because of the reduced resolution.
Choice of basic transcoder architecture:
DCT domain transcoders have the main drawback that motion compensation
in the transform domain is very computationally intensive. DCT domain transcoders
are also less flexible than pixel domain transcoders; for instance, the SDDT
architecture can only be used for bit rate reduction transcoding. It assumes that
the spatial and temporal resolutions stay the same and that the output video uses
the same frame types, mode decisions and motion vectors as the input video.
For H.264 to VC-1 transcoding, several changes are required in order to
accommodate the mismatches between the two standards. For instance, for motion
estimation and compensation, H.264 supports 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and
4x4 macroblock partitions (Fig 6), but VC-1 supports only 16x16 and 8x8 (Fig 7).
The transform sizes and types (4x4 and 8x8 in H.264; 8x8, 4x8, 8x4 and 4x4 in
VC-1) are also different and make transform domain transcoding prohibitively
complex. Hence, DCT domain transcoders are not ideal here.
Fig.6 Segmentations of the macroblock for motion compensation in H.264
Top: segmentation of macroblocks, bottom: segmentation of 8x8 partitions. [2]
Fig.7 Segmentations of the macroblock for motion compensation in VC-1 [2]
From Fig. 8 it can be inferred that the cascaded pixel domain architecture
outperforms the DCT domain transcoders. Also, for larger GOP sizes the drift in
DCT domain transcoders becomes more significant.
Fig. 8 PSNR vs bit-rate graph for the Foreman sequence transcoded with a GOP size of 15, using
different transcoding architectures as described in Figs. 1, 2, 3 and 5. [5]
Hence, heterogeneous transcoding in the pixel domain is preferred for
standards transcoding.
Standards transcoding:
When transcoding between two different standards, the main factor involved is
compatibility between the profile and level of the input stream and those of the
output stream for the intended application. The goal here is to transcode an H.264
bitstream of Baseline profile to a VC-1 bitstream of Simple profile.
Table 1 compares and contrasts the characteristics of both standards.

Characteristic                        H.264 High Profile                      VC-1 Main Profile
Chroma format                         4:2:0                                   4:2:0
Picture coding types                  I, P, B                                 I, P, B
Transform sizes                       4x4, 8x8                                8x8, 4x8, 8x4, 4x4
Intra prediction                      Directional predictors                  None
Block sizes for motion compensation   16x16, 16x8, 8x16, 8x8, 4x8, 8x4, 4x4   16x16, 8x8

Table 1 Main characteristics of H.264 High profile and VC-1 Main profile
Overview of H.264:
H.264 [2] is a standard for video compression and is equivalent to
MPEG-4 Part 10, or MPEG-4 AVC (Advanced Video Coding) (Fig 9). As of 2008,
it is the latest block-oriented motion-compensation-based video standard developed
by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC
Moving Picture Experts Group (MPEG), and it was the product of a partnership effort
known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC
MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are jointly maintained so
that they have identical technical content.
Fig 9 H.264 Encoder [32]
Fig 10. H.264 Decoder [32]
The standardization of the first version of H.264/AVC was completed in May
2003. The JVT then developed extensions to the original standard that are known as
the Fidelity Range Extensions (FRExt) [29]. These extensions enable higher quality
video coding by supporting increased sample bit depth precision and higher-resolution
color information, including sampling structures known as YUV 4:2:2 and YUV
4:4:4. Several other features are also included in the Fidelity Range Extensions
project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-
specified perceptual-based quantization weighting matrices, efficient inter-picture
lossless coding, and support of additional color spaces. The design work on the
Fidelity Range Extensions was completed in July 2004, and the drafting work on them
was completed in September 2004.
Scalable video coding (SVC) [30] as specified in Annex G of H.264/AVC
allows the construction of bitstreams that contain sub-bitstreams that conform to
H.264/AVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream
with a smaller temporal sampling rate than the bitstream, complete access units are
removed from the bitstream when deriving the sub-bitstream. In this case, high-level
syntax and inter prediction reference pictures in the bitstream are constructed
accordingly. For spatial and quality bitstream scalabilities, i.e. the presence of a
sub-bitstream with lower spatial resolution or quality than the bitstream, network
abstraction layer (NAL) units are removed from the bitstream when deriving the
sub-bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial
resolution or quality signal by data of the lower spatial resolution or quality signal, is
typically used for efficient coding. The Scalable Video Coding extension was
completed in November 2007.
Some of the features adopted in H.264 for enhancement of prediction, improved
coding efficiency and robustness to data errors/losses are listed as follows.
Features for enhancement of prediction
• Directional spatial prediction for intra coding
• Variable block-size motion compensation with small block size
Figure 11 – Various block sizes in H.264
• Quarter-sample-accurate motion compensation
• Motion vectors over picture boundaries
• Multiple reference picture motion compensation
• Decoupling of referencing order from display order
• Decoupling of picture representation methods from picture referencing
capability
• Weighted prediction
• Improved “skipped” and “direct” motion inference
• In-the-loop deblocking filtering
Features for improved coding efficiency
• Small block-size transform
• Exact-match inverse transform
Figure – Forward 4x4 and 8x8 integer transform
• Short word-length transform
• Hierarchical block transform
• Arithmetic entropy coding
• Context-adaptive entropy coding
Features for robustness to data errors/losses
• Parameter set structure
• NAL unit syntax structure
• Flexible slice size
• Flexible macroblock ordering (FMO)
• Arbitrary slice ordering (ASO)
• Redundant pictures
• Data partitioning
• SP/SI synchronization/switching pictures
Profiles in H.264
The H.264 standard defines numerous profiles:
• Constrained baseline profile
• Baseline
• Main profile
• Extended profile
• High profile
• High 10 profile
• High 4:2:2 profile
• High 4:4:4 predictive profile
• Stereo high profile
• High 10 intra profile
• High 4:2:2 intra profile
• High 4:4:4 intra profile
• CAVLC 4:4:4 intra profile
• Scalable baseline profile
• Scalable high profile
• Scalable high intra profile
Table – Features in baseline, main and extended profiles
Table – Features in high profiles
Figure 12 Comparison of H.264 baseline, main, extended and high profiles
(nested feature sets: the baseline profile includes I and P slices, CAVLC,
arbitrary slice order, flexible macroblock order and redundant slices; the main
profile adds B slices, CABAC and weighted prediction; the extended profile adds
SP/SI slices and data partitioning; the high profiles add adaptive transform block
size and quantization scaling matrices)
Overview of VC-1
VC-1 [1] is the informal name of the SMPTE 421M video codec standard
initially developed by Microsoft. It was released by SMPTE on April 3, 2006. It is
now a supported codec for Blu-ray Discs and is implemented as Windows Media Video 9.
VC-1 is an evolution of the conventional DCT-based video codec design also
found in H.261 [31], H.263 [27], MPEG-1 [41] and MPEG-2 [3]. It is widely
characterized as an alternative to the latest ITU-T and MPEG video codec standard
known as H.264/MPEG-4 AVC. VC-1 contains coding tools for interlaced video
sequences as well as progressive encoding. The main goal of VC-1 development and
standardization is to support the compression of interlaced content without first
converting it to progressive, making it more attractive to broadcast and video industry
professionals.
The VC-1 codec is designed to achieve state-of-the-art compressed video
quality at bit rates that may range from very low to very high. The codec can easily
handle 1920 pixel × 1080 pixel resolution at 6 to 30 megabits per second (Mbps) for
high-definition video. VC-1 is capable of higher resolutions such as 2048 pixels ×
1536 pixels for digital cinema, and of a maximum bit rate of 135 Mbps. An example
of very low bit rate video would be 160 pixel × 120 pixel resolution at 10 kilobits per
second (Kbps) for modem applications.
The basic functionality of VC-1 involves a block-based motion compensation
and spatial transform scheme similar to that used in other video compression
standards such as MPEG-1 and H.261 [31]. However, VC-1 includes a number of
innovations and optimizations that make it distinct from the basic compression
scheme, resulting in excellent quality and efficiency. VC-1 Advanced Profile is also
transport independent. This provides even greater flexibility for device manufacturers
and content services.
Fig. 11 VC-1 codec [32]
Profiles in VC-1
VC-1 defines three profiles:
1. Simple
2. Main
3. Advanced
Feature                                     Simple   Main   Advanced
Baseline intra frame compression            Yes      Yes    Yes
Variable-sized transform                    Yes      Yes    Yes
16-bit transform                            Yes      Yes    Yes
Overlapped transform                        Yes      Yes    Yes
4 motion vectors per macroblock             Yes      Yes    Yes
¼ pixel luminance motion compensation       Yes      Yes    Yes
¼ pixel chrominance motion compensation     No       Yes    Yes
Start codes                                 No       Yes    Yes
Extended motion vectors                     No       Yes    Yes
Loop filter                                 No       Yes    Yes
Dynamic resolution change                   No       Yes    Yes
Adaptive macroblock quantization            No       Yes    Yes
B frames                                    No       Yes    Yes
Intensity compensation                      No       Yes    Yes
Range adjustment                            No       Yes    Yes
Field and frame coding modes                No       No     Yes
GOP layer                                   No       No     Yes
Display metadata                            No       No     Yes

Table – Features in VC-1 profiles [49]
Innovations
VC-1 includes a number of innovations that enable it to produce high quality
content. This section provides brief descriptions of some of these features.
Adaptive Block Size Transform
Traditionally, 8 × 8 transforms have been used for image and video coding.
However, there is evidence to suggest that 4 × 4 transforms can reduce ringing
artifacts at edges and discontinuities. VC-1 is capable of coding an 8 × 8 block using
either an 8 × 8 transform, two 8 × 4 transforms, two 4 × 8 transforms, or four 4 × 4
transforms. This feature enables coding that takes advantage of the different transform
sizes as needed for optimal image quality.
Figure – VC-1 transform sizes [4]
16-Bit Transforms
In order to minimize the computational complexity of the decoder, VC-1 uses
16-bit transforms. This also has the advantage of easy implementation on the large
amount of digital signal processing (DSP) hardware built with 16-bit processors.
Among the constraints put on transforms specified in VC-1 is the requirement that the
16-bit values used produce results that can fit in 16 bits. The constraints on transforms
ensure that decoding is as efficient as possible on a wide range of devices.
Motion Compensation
Motion compensation is the process of generating a prediction of a video
frame by displacing the reference frame. Typically, the prediction is formed for a
block (an 8 × 8 pixel tile) or a macroblock (a 16 × 16 pixel tile) of data. The
displacement of data due to motion is defined by a motion vector, which captures the
shift along both the x- and y-axes.
Figure VC-1 motion compensation sizes [4]
The efficiency of the codec is affected by the size of the predicted block, the
granularity of sub-pixel data that can be captured, and the type of filter used for
generating sub-pixel predictors. VC-1 uses 16 × 16 blocks for prediction, with the
ability to generate mixed frames of 16 × 16 and 8 × 8 blocks. The finest granularity of
sub-pixel information supported by VC-1 is 1/4 pixel. Two sets of filters are used by
VC-1 for motion compensation. The first is an approximate bicubic filter with four
taps. The second is a bilinear filter with two taps. The four-tap bicubic filters used in
VC-1 for ¼ and ½ pixel shifts are: [-4 53 18 -3]/64 and [-1 9 9 -1]/16.
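The four-tap filtering can be sketched directly from the taps quoted above. The function names, rounding offsets and 8-bit clipping below are illustrative assumptions, not taken from the VC-1 text:

```python
def filter_quarter_pel(p0, p1, p2, p3):
    """Approximate-bicubic 4-tap filter for a 1/4-pel position between
    p1 and p2, using the taps [-4, 53, 18, -3] / 64 quoted in the text."""
    acc = -4 * p0 + 53 * p1 + 18 * p2 - 3 * p3
    # Integer rounding and clipping to the 8-bit sample range (assumed)
    return min(255, max(0, (acc + 32) >> 6))

def filter_half_pel(p0, p1, p2, p3):
    """Bicubic 4-tap filter for the 1/2-pel position: [-1, 9, 9, -1] / 16."""
    acc = -1 * p0 + 9 * p1 + 9 * p2 - 1 * p3
    return min(255, max(0, (acc + 8) >> 4))
```

Note that both tap sets sum to the divisor (64 and 16 respectively), so a flat region is reproduced exactly, e.g. filter_quarter_pel(100, 100, 100, 100) returns 100.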
Figure – Integer, half and quarter pel positions [2]
(A-Q Integer, aa-hh half, a-s quarter pel positions)
VC-1 combines the motion vector settings defined by the block size, sub-
pixel resolution, and filter type into modes. The result is four motion compensation
modes that suit a range of different situations. This classification of settings into
modes also helps compact decoder implementations.
Loop Filtering
VC-1 uses an in-loop deblocking filter that attempts to remove block-
boundary discontinuities introduced by quantization errors in interpolated frames.
These discontinuities can cause visible artifacts in the decompressed video frames and
can impact the quality of the frame as a predictor for future interpolated frames.
Figure – Loop filtering in VC-1 [4] (Only pixels p4 and p5 are filtered)
The loop filter takes into account the adaptive block size transforms. The filter
is also optimized to reduce the number of operations required.
Interlaced Coding
Interlaced video content is widely used in television broadcasting. When
encoding interlaced content, the VC-1 codec can take advantage of the characteristics
of interlaced frames to improve compression. This is achieved by using data from
both fields to predict motion compensation in interpolated frames.
Advanced B Frame Coding
A bi-directional or B frame is a frame that is interpolated from data both in
previous and subsequent frames. B frames are distinct from I frames (also called key
frames), which are encoded without reference to other frames. B frames are also
distinct from P frames, which are interpolated from previous frames only. VC-1
includes several optimizations that make B frames more efficient. VC-1 does not have
a fixed group of pictures (GOP) structure and the number of pictures in a GOP can
vary.
Fading Compensation
Due to the nature of compression that uses motion compensation, encoding of
video frames that contain fades to or from black is very inefficient. With a uniform
fade, every macroblock needs adjustments to luminance. VC-1 includes fading
compensation, which detects fades and uses alternate methods to adjust luminance.
This feature improves compression efficiency for sequences with fading and other
global illumination changes.
Differential Quantization
Differential quantization, or dquant, is an encoding method in which multiple
quantization steps are used within a single frame. Rather than quantize the entire
frame with a single quantization level, macroblocks are identified within the frame
that might benefit from lower quantization levels and a greater number of preserved
AC coefficients. Such macroblocks are then encoded at lower quantization levels than the
one used for the remaining macroblocks in the frame. The simplest and typically most
efficient form of differential quantization involves only two quantizer levels
(bi-level dquant), but VC-1 also supports multiple levels.
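Bi-level dquant can be sketched as follows. This is a simplified illustration with a plain uniform quantizer standing in for the actual VC-1 quantizer; the function and parameter names are my own:

```python
import numpy as np

def bilevel_dquant(coeff_blocks, base_qp, low_qp, importance_mask):
    """Bi-level differential quantization sketch: macroblocks flagged in
    importance_mask are quantized with the finer step low_qp, the rest
    with base_qp. Returns (step, quantized levels) per block."""
    out = []
    for coeffs, important in zip(coeff_blocks, importance_mask):
        qp = low_qp if important else base_qp
        # Uniform quantization: level = round(coefficient / step)
        levels = np.rint(np.asarray(coeffs, dtype=float) / qp).astype(int)
        out.append((qp, levels))
    return out
```

A finer step on the flagged macroblocks preserves more AC coefficients there, at the cost of extra bits, which is the trade-off the paragraph above describes.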
MAPPING DIFFERENCES BETWEEN THE TWO STANDARDS:
The transcoding algorithm considered in this research assumes full H.264
decoding down to the pixel level, followed by a reduced complexity VC-1 encoding.
The data gathered during the H.264 decoding stage is used to accelerate the VC-1
encoding stage. It is assumed that the H.264 encoded bitstream is generated with an
R-D optimized encoder. The picture coding types used are similar in both standards.
The transform sizes and types are different and make transform domain transcoding
prohibitively complex. The semantics of intra MBs are similar, except for the intra
directional prediction allowed in H.264 and the mixed MBs in VC-1. Inter prediction
has significant differences, including the MC block size, the transform block size
and the reference frames used. The similarities between the codecs can be exploited
to reduce the transcoding complexity.
Intra MB Mode Mapping:
An intra MB in the incoming H.264 bitstream is coded as a VC-1 intra MB. An
H.264 intra MB can be coded as Intra 4x4 (9 directional modes) or Intra 16x16
(4 modes), whereas a VC-1 intra MB has four 8x8 blocks and no prediction modes.
Since an intra MB in VC-1 uses the 8x8 transform irrespective of the block size
(16x16 or 4x4) in H.264, the intra prediction type in H.264 need not be carried
over. Table 2 shows the proposed intra MB mapping.
H.264 Intra MB            VC-1 Intra MB
Intra 16x16 (any mode)    Intra MB 8x8
Intra 4x4 (any mode)      Intra MB 8x8

Table 2 H.264 and VC-1 Intra MB mapping
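Because Table 2 collapses every H.264 intra mode to the same VC-1 MB type, the mapping reduces to a trivial function; the names below are my own notation:

```python
def map_intra_mb(h264_intra_type, h264_pred_mode):
    """Intra MB mapping from Table 2: every H.264 intra MB (Intra 4x4,
    any of 9 modes, or Intra 16x16, any of 4 modes) becomes a VC-1
    intra MB of four 8x8 blocks. The prediction mode argument is
    deliberately ignored, since VC-1 intra blocks carry no directional
    prediction."""
    if h264_intra_type not in ("Intra4x4", "Intra16x16"):
        raise ValueError("not an intra MB")
    return "IntraMB8x8"
```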
Figure – Matrix for one-dimensional 8-point inverse transform [32]
Inter MB Mode Mapping:
An inter-coded MB in the incoming H.264 bitstream is coded as an inter MB in
VC-1. The inter MB in H.264 has 7 motion compensation sizes: 16x16, 16x8, 8x16,
8x8, 4x8, 8x4 and 4x4. The inter MB in VC-1 has 2 motion compensation sizes: 16x16
and 8x8. Another significant difference is that H.264 uses 4x4 (and 8x8 in the
fidelity range extensions) transform sizes, whereas VC-1 uses 4 transform sizes:
8x8, 4x8, 8x4 and 4x4.
The 16x16, 8x16 and 16x8 motion compensation sizes are usually selected in
H.264 for areas that are relatively uniform; these are mapped to an inter 16x16 MB
in VC-1, with the selected H.264 MC block size used as a measure of homogeneity to
choose the transform size applied in VC-1.
The 8x8, 8x4, 4x8 and 4x4 modes are usually selected in H.264 for areas with
non-uniform motion. The 16x16 mode in VC-1 is eliminated for such non-uniform MBs;
the MB is instead mapped to the 8x8 block size in VC-1, with the H.264 block size
determining the transform size to be used in VC-1.
Table 3 describes the decision making for mapping the inter MBs and the type of
transform to be used in VC-1.
H.264 Inter MB    VC-1 Inter MB    Transform size in VC-1
Inter 16x16       Inter 16x16      8x8
Inter 16x8        Inter 16x16      8x4
Inter 8x16        Inter 16x16      4x8
Inter 8x8         Inter 8x8        8x8
Inter 4x8         Inter 8x8        4x8
Inter 8x4         Inter 8x8        8x4
Inter 4x4         Inter 8x8        4x4

Table 3 H.264 and VC-1 Inter MB mapping and VC-1 transform type
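The decision making of Table 3 is a direct lookup; a Python sketch (the tuple-based encoding of partition sizes is my own notation, not from the proposal):

```python
# Inter MB mode and transform-size mapping from Table 3.
# Keys are H.264 MC partition sizes; values are (VC-1 MC size, VC-1 transform).
INTER_MB_MAP = {
    (16, 16): ((16, 16), (8, 8)),
    (16, 8):  ((16, 16), (8, 4)),
    (8, 16):  ((16, 16), (4, 8)),
    (8, 8):   ((8, 8),   (8, 8)),
    (4, 8):   ((8, 8),   (4, 8)),
    (8, 4):   ((8, 8),   (8, 4)),
    (4, 4):   ((8, 8),   (4, 4)),
}

def map_inter_mb(h264_partition):
    """Return the VC-1 MC block size and transform size for an H.264
    inter partition, per Table 3."""
    return INTER_MB_MAP[h264_partition]
```

Note how the aspect ratio of the H.264 partition is carried into the VC-1 transform choice (e.g. a 16x8 partition selects the 8x4 transform), using the partition as a homogeneity hint as described above.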
Motion vector mapping:
Re-use of motion vectors selected in H.264 can significantly reduce the complexity of
VC-1 encoding. Table 4 describes the re-use of motion vectors.
H.264 Inter MB    VC-1 Inter MB    Motion vector re-use
Inter 16x16       Inter 16x16      Same motion vectors
Inter 16x8        Inter 16x16      Average of motion vectors
Inter 8x16        Inter 16x16      Average of motion vectors
Inter 8x8         Inter 8x8        Same motion vectors
Inter 4x8         Inter 8x8        Average of motion vectors
Inter 8x4         Inter 8x8        Average of motion vectors
Inter 4x4         Inter 8x8        Average of motion vectors

Table 4 H.264 and VC-1 Inter MB motion vector mapping
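Table 4 can likewise be sketched as a small function. The integer averaging used below is an assumption; the proposal does not specify how the averaged vector is rounded:

```python
def map_motion_vectors(h264_partition, mvs):
    """Motion-vector re-use from Table 4: 16x16 and 8x8 partitions keep
    their vector; all other partitions average the vectors of their
    sub-blocks to form the single VC-1 vector."""
    if h264_partition in ((16, 16), (8, 8)):
        return mvs[0]                       # same motion vector
    n = len(mvs)
    avg_x = sum(mv[0] for mv in mvs) // n   # average of motion vectors
    avg_y = sum(mv[1] for mv in mvs) // n   # (integer rounding assumed)
    return (avg_x, avg_y)
```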
Reference Pictures:
The H.264/AVC standard allows up to sixteen reference pictures for motion
estimation, while VC-1 uses only one or two, for P and B slices respectively. The
reuse of motion vectors implies using the same reference pictures to maintain their
meaning. The motion vector conversion assumes that the motion vector length is
related to the reference picture distance [39]. The source motion vectors are
scaled, as shown in Fig 12, so that valid VC-1 reference pictures are used. This
conversion assumes constant motion between the H.264/AVC and VC-1 reference
pictures; the motion vector is scaled by the ratio of the temporal distances to the
two reference pictures.
Fig 12 Motion vector scaling [38]
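The scaling step above can be sketched as follows; the function name, the distance arguments and the rounding back to the integer MV grid are my own assumptions:

```python
def scale_motion_vector(mv, h264_ref_distance, vc1_ref_distance):
    """Scale an H.264 motion vector onto a valid VC-1 reference picture,
    assuming constant motion: the vector length is taken proportional to
    the temporal distance of the reference picture [39]."""
    if h264_ref_distance == 0:
        raise ValueError("reference distance must be non-zero")
    scale = vc1_ref_distance / h264_ref_distance
    # Round back to the integer MV grid (rounding choice is an assumption)
    return (round(mv[0] * scale), round(mv[1] * scale))
```

For example, a vector pointing 4 pictures back in H.264 is shortened to a quarter of its length when retargeted to the VC-1 reference 1 picture back.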
Skipped Macroblock:
When a skipped macroblock is signaled in the bit stream, no further data is sent
for that macroblock. The conversion of H.264 skip macroblocks to VC-1 skip
macroblocks is straightforward: since the skip macroblock definitions of the two
standards are fully compatible, a direct conversion is possible.
OPEN LOOP TRANSCODER:
The open loop transcoder is designed by cascading an H.264 encoder [44], an H.264
decoder [44], a VC-1 encoder [45] and a VC-1 decoder [45].

YUV → H.264 Encoder → H.264 Decoder → VC-1 Encoder → VC-1 Decoder → YUV
Fig 13 Open loop transcoder
Performance of open loop transcoder
The mean square error (MSE), peak signal-to-noise ratio (PSNR) and structural
similarity index measure (SSIM) for the Foreman QCIF sequence (3 frames) are
calculated using the open loop transcoder.
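The per-frame quality measures reported in Figs. 14–16 can be computed as sketched below. MSE and PSNR follow their standard definitions for 8-bit video; the SSIM shown is a simplified single-window version of the formula (the usual measure averages it over a sliding window), so it only illustrates the structure of the metric:

```python
import numpy as np

def mse(ref, test):
    """Mean square error between two frames of equal size."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    return np.mean((ref - test) ** 2)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit samples."""
    e = mse(ref, test)
    return float("inf") if e == 0 else 10.0 * np.log10(peak ** 2 / e)

def global_ssim(ref, test, peak=255.0):
    """Single-window SSIM over the whole frame (simplified sketch)."""
    x = np.asarray(ref, dtype=float)
    y = np.asarray(test, dtype=float)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2   # standard constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Applied per luminance frame of the decoded output against the original YUV input, these give the curves plotted for the Foreman sequence.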
Fig 14 MSE of open loop transcoder – Foreman sequence
Fig 15 PSNR of open loop transcoder – Foreman sequence
Fig 16 SSIM of open loop transcoder – Foreman sequence
CONCLUSIONS:
As mentioned earlier, it is proposed to transcode an H.264 bitstream to a VC-1
stream in the pixel domain (CPDT) and compare the results (MSE, PSNR, SSIM,
complexity, bit rates) against an open loop transcoder. Since the motion vectors
are not re-estimated, the complexity on the encoder side reduces by about 40-50%.
The road map ahead is to extract re-usable information from the H.264 bitstream to
be used in VC-1 encoding.
REFERENCES:
[1] VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE
421M-2006), SMPTE Standard, 2006.
[2] T. Wiegand et al, “Overview of the H.264/AVC video coding standard,” IEEE
Trans. CSVT, Vol. 13, pp. 560-576, July 2003.
[3] C. Chen, P.-H. Wu and H. Chen, "MPEG-2 to H.264 transcoding," Picture Coding
Symposium, 15-17 Dec. 2004.
[4] Jae-Beom Lee and H. Kalva, "An efficient algorithm for VC-1 to H.264 video
transcoding in progressive compression," IEEE International Conference on
Multimedia and Expo, pp. 53-56, July 2006
[5] J. Xin, C.W. Lin and M.T. Sun, "Digital video transcoding," Proceedings of the
IEEE, vol. 93, pp. 84-97, Jan. 2005.
[6] A. Vetro, C. Christopoulos and H. Sun, "Video transcoding architectures and
techniques: An overview," IEEE Signal Processing Magazine, vol. 20, pp. 18-29,
Mar. 2003.
[7] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 /
ISO / IEC 14496-10, Mar 2005.
[8] S. Srinivasan and S. L. Regunathan, “An overview of VC-1” Proc. SPIE, vol.
5960, pp. 720–728, 2005.
[9] P. List et al, “Adaptive deblocking filter,” IEEE Trans. Circuits Syst. Video
Technol., vol. 13, pp.614–619, Jun. 2003.
[10] T. D. Tran, J. Liang and C. Tu, "Lapped transform via time-domain pre- and
post-filtering," IEEE Trans. Signal Proc., vol. 51, pp. 1557–1571, Jun. 2003.
[11] C. C. Cheng, T. S. Chang and K. B. Lee, "An in-place architecture for the
deblocking filter in H.264/AVC," IEEE Trans. Circuits Syst. II, Exp. Briefs,
vol. 53, pp. 530–534, Jul. 2006.
[12] T. C. Chen et al, "Analysis and architecture design of an HDTV720p 30 frames/s
H.264/AVC encoder," IEEE Trans. Circuits Syst. Video Technol., vol. 16,
pp. 673–688, Jun. 2006.
[13] Y.-W. Huang et al, "Architecture design for deblocking filter in
H.264/JVT/AVC," in IEEE Proc. Int. Conf. Multimedia and Expo, pp. 693–696,
Jul. 2003.
[14] S.-C. Chang et al, "A platform based bus-interleaved architecture for
de-blocking filter in H.264/MPEG-4 AVC," IEEE Trans. Consumer Electron., vol. 51,
pp. 249–255, Feb. 2005.
[15] M. Sima, Y. Zhou and W. Zhang, "An efficient architecture for adaptive
deblocking filter of H.264/AVC video coding," IEEE Trans. Consumer Electronics,
vol. 50, pp. 292–296, Feb. 2004.
[16] S.-Y. Shih, C.-R. Chang and Y.-L. Lin, "A near optimal deblocking filter for
H.264 advanced video coding," in Proc. Asia and South Pacific Design Automation
Conf., pp. 170–175, Jan. 2006.
[17] T.-M. Liu et al, "A memory-efficient deblocking filter for H.264/AVC video
coding," in Proc. IEEE Int. Symp. Circuits Syst., pp. 2140–2143, May 2005.
[18] T.-M. Liu et al, "A 125 µW fully scalable MPEG-2 and H.264/AVC video decoder
for mobile applications," IEEE J. Solid-State Circuits, vol. 42, pp. 161–169,
Jan. 2007.
[19] L. Li, S. Goto and T. Ikenaga, "An efficient deblocking filter architecture
with 2-dimensional parallel memory for H.264/AVC," in Proc. Asia and South Pacific
Design Automation Conf., pp. 623–626, 2005.
[20] H.-Y. Lin et al, "Efficient deblocking filter architecture for H.264 video
coders," in IEEE ISCAS, p. 4, May 2006.
[21] T.-M. Liu, W.-P. Lee and C.-Y. Lee, "An in/post-loop deblocking filter with
hybrid filtering schedule," IEEE Trans. Circuits Syst. for Video Technol., vol. 17,
pp. 937–943, Jul. 2007.
[22] I. Ahmad et al, "Video transcoding: An overview of various techniques and
research issues," IEEE Trans. on Multimedia, vol. 7, pp. 793-8, Oct. 2005.
[23] Y.L. Lee and T.Q. Nguyen, "Analysis and efficient architecture design for
VC-1 overlap smoothing and in-loop deblocking filter," IEEE Trans. Circuits and
Syst. for Video Technol., vol. 18, pp. 1786–1796, Dec. 2008.
[24] G. Fernandez-Escribano et al, "Speeding-up the macroblock partition mode
decision for MPEG-2 to H.264 transcoding," Proceedings of IEEE ICIP 2006, Atlanta,
pp. 869–872, Sept. 2006.
[25] Z. Zhou et al, "Motion information and coding mode reuse for MPEG-2 to H.264
transcoding," Proceedings of the IEEE ISCAS 2005, pp. 1230–1233, May 2005.
[26] B. Petljanski and H. Kalva, "DCT domain intra MB mode decision for MPEG-2 to
H.264 transcoding," Proceedings of the IEEE ICCE 2006, pp. 419–420, Jan. 2006.
[27] J. Bialkowski, A. Kaup and K. Illgner, "Fast transcoding of intra frames
between H.263 and H.264," IEEE ICIP, vol. 4, pp. 2785–2788, Oct. 2004.
[28] Y.-K. Lee, S.-S. Lee and Y.-L. Lee, "MPEG-4 to H.264 transcoding using
macroblock statistics," Proceedings of the IEEE ICME 2006, Toronto, Canada,
pp. 57–60, July 2006.
[29] G. Sullivan, P. Topiwala and A. Luthra, "The H.264/AVC video coding standard:
overview and introduction to the fidelity range extensions," SPIE Conference on
Applications of Digital Image Processing XXVII, vol. 5558, pp. 53–74, Aug. 2004.
[30] T. Wiegand et al, "Introduction to the Special Issue on Scalable Video
Coding—Standardization and Beyond," IEEE Trans. on Circuits and Systems for Video
Technology, vol. 17, p. 1034, Sept. 2007.
[31] Von Roden and T. Praktische, "H.261 and MPEG1 - A comparison," Conference
Proceedings of the 1996 IEEE Fifteenth Annual International Phoenix Conference on
Computers and Communications, pp. 65–71, Mar. 1996.
[32] S. Srinivasan et al, "Windows Media Video 9: overview and applications,"
Signal Processing: Image Communication, vol. 19, pp. 851–875, Oct. 2004.
[33] S. K. Kwon, A. Tamhankar and K.R. Rao, "An overview of H.264/MPEG-4 Part 10,"
Special issue of Journal of Visual Communication and Image Representation, vol. 17,
pp. 186–216, April 2006.
[34] G.A. Davidson et al, "ATSC video and audio coding," Proc. IEEE, vol. 94,
pp. 60–76, Jan. 2006.
23
24. [35]J. Bialkowski, M Barkowky and A. Kaup, “Overview of low complexity video
transcoding from H.263 to H.264” IEEE ICME, pp 49-52, 2006.
[36] T. D. Nguyen et al., "Efficient MPEG-4 to H.264/AVC transcoding with spatial
downscaling," ETRI Journal, vol. 29, no. 6, pp. 826-828, Dec. 2007.
[37] H. Kalva, G. Fernandez-Escribano and K. Kunzelmann, "Reduced resolution MPEG-2 to
H.264 transcoder," Proc. SPIE, vol. 7257, 72571V, Jan. 2009.
[38] S. Moiron et al., "H.264/AVC to MPEG-2 video transcoding architecture," Proc.
Conference on Telecommunications - ConfTele, Peniche, Portugal, vol. 1, pp. 449-
452, May 2007.
[39] S. Moiron et al., "Video transcoding from H.264/AVC to MPEG-2 with reduced
computational complexity," Signal Processing: Image Communication, vol. 24, pp.
637-650, Sept. 2009.
[40] M.-J. Chen, M.-C. Chu and C.-W. Pan, "Efficient motion-
estimation algorithm for reduced frame-rate video transcoder," IEEE Trans. on
Circuits and Systems for Video Technology, vol. 12, pp. 269-275, Apr. 2002.
[41] ISO/IEC 11172-2:1993, Information technology -- Coding of moving pictures and
associated audio for digital storage media at up to about 1,5 Mbit/s -- Part 2:
Video.
[42] H. Kalva and J.-B. Lee, "The VC-1 video coding standard," IEEE Multimedia,
vol. 14, pp. 88-91, Oct.-Dec. 2007.
[43] P. Bordes and A. Orhand, "Improved algorithm for fast transcoding H.264,"
Proceedings of EUSIPCO 2007.
REFERENCE BOOKS:
[44] K. Sayood, "Introduction to Data Compression," 3rd edition, Morgan
Kaufmann Publishers, 2006.
[45] I. E. G. Richardson, "H.264 and MPEG-4 Video Compression: Video Coding for
Next-generation Multimedia," Wiley, 2003.
[46] K. R. Rao and P. C. Yip, "The Transform and Data Compression Handbook,"
CRC Press, Boca Raton, FL, 2001.
[47] K. R. Rao and J. J. Hwang, "Techniques and Standards for Image, Video, and
Audio Coding," Prentice Hall, 1996.
[48] J.-B. Lee and H. Kalva, "The VC-1 and H.264 Video Compression Standards
for Broadband Video Services," Springer, 2008.
REFERENCE WEBSITES:
[49] JM software: http://iphome.hhi.de/suehring/tml/
[50] VC-1 software: http://www.smpte.org/home
[51] Microsoft website - VC-1 Technical Overview:
http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx#VC1ComparedtoOtherCodecs
[52] VC-1 Wikipedia site: http://en.wikipedia.org/wiki/VC-1
ACRONYMS:
ASO Arbitrary slice ordering
AVC Advanced Video Coding
B MB Bi-predicted MB
CDDT Cascaded DCT Domain Transcoder
CPDT Cascaded Pixel Domain Transcoder
DCT Discrete Cosine Transform
DSP Digital Signal Processing
DVD Digital Versatile Disc
FMO Flexible macroblock ordering
FRExt Fidelity Range Extensions
GOP Group Of Pictures
I MB Intra-predicted MB
IEC International Electrotechnical Commission
ISO International Organization for Standardization
ITU-T International Telecommunication Union - Telecommunication
Standardization Sector
JVT Joint Video Team
P MB Inter-predicted MB
IDCT Inverse Discrete Cosine Transform
IQ Inverse Quantizer
MB Macroblock
ME Motion Estimation
MC Motion Compensation
MV Motion Vector
MPEG Moving Picture Experts Group
MSE Mean Square Error
PSNR Peak Signal-to-Noise Ratio
Q Quantizer
R-D Rate - Distortion
SDDT Simplified DCT Domain Transcoder
SP/SI Switched P / Switched I
SMPTE Society of Motion Picture and Television Engineers
SSIM Structural Similarity Index Measure
SVC Scalable Video Coding
VCEG Video Coding Experts Group
VLC Variable Length Coding
VLD Variable Length Decoder
YUV Y (Luminance) and UV (Chrominance)