H.264 to VC-1 Transcoding
Vidhya Vijayakumar
Multimedia Processing Lab, MSEE, University of Texas at Arlington
[email_address]
Guided by Dr. KR Rao

What is…
H.264
- The new industry standard
- Massive quality, minimal files
- Scalable from 3G to HD and beyond
VC-1
- Informal name of the SMPTE 421M video codec
- Standard initially developed by Microsoft
- Supported standard for HD DVDs, Blu-ray Discs, and Windows Media Video

What is… Transcoding
- Converting a previously compressed video signal into another one with a different format
- The change may be in bit rate, frame rate, frame size, or even compression standard
- Two ways
  - Decode fully and re-encode in the target standard
  - Convert the bit stream from one standard to the other without going through the complete decoding and encoding process
- Limitations
  - Compression artifacts are cumulative

Why Transcode H.264 to VC-1?
- The two high-definition DVD formats, HD DVD and Blu-ray, have mandated MPEG-2, H.264 and VC-1 as video compression formats
- As H.264-based and VC-1-based content and products become available, transcoding in both directions will become a widely used capability
- From an end user's point of view, any VC-1 decoder becomes twice as useful, since with a transcoder it can play H.264 content as well

Why VC-1?
- Requires less computational power and can be decoded at full 1080i/p resolution on today’s off-the-shelf PCs
- Advanced Profile delivers compression efficiency far superior to MPEG-2
- Delivers HD content at bit rates as low as 6-8 Mbps
- Better visual quality than H.264 and MPEG-2 demonstrated in independent tests

More of VC-1…
- DCT-based video codec design
- Coding tools for interlaced as well as progressive video sequences
- 8-bit, 4:2:0 format
- Uses block-based transform and motion compensation with quantization and entropy coding

Decoder – Simple & Main profile
Decoder – Advanced Profile
Block Transforms (Integer DCT)
- An 8x8 block can be encoded using
  - one 8x8 transform
  - two 8x4 transforms
  - two 4x8 transforms
  - four 4x4 transforms
- Signaling at frame, macroblock, or block level
  - Block level for coarse and fine level specification
  - Frame level for overhead reduction
- Only 8x8 is used for I frames

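The four ways of covering an 8x8 block can be sketched in a few lines. This is only an illustration: the names `TRANSFORM_MODES` and `split_block` are made up here, and the sketch assumes that "8x4" means 8 samples wide by 4 samples high (so the block is split into a top and a bottom half).

```python
# Sub-block shape as (height, width) for each way of coding an 8x8 block.
# Assumption: the mode name is written width x height, so "8x4" is 8 wide, 4 high.
TRANSFORM_MODES = {
    "8x8": (8, 8),  # one transform
    "8x4": (4, 8),  # two transforms: top half and bottom half
    "4x8": (8, 4),  # two transforms: left half and right half
    "4x4": (4, 4),  # four transforms
}


def split_block(block, mode):
    """Split an 8x8 block (list of 8 rows of 8 samples) into sub-blocks.

    Sub-blocks are returned in raster order, each as a list of rows.
    """
    height, width = TRANSFORM_MODES[mode]
    return [
        [row[x:x + width] for row in block[y:y + height]]
        for y in range(0, 8, height)
        for x in range(0, 8, width)
    ]
```

Each sub-block returned by `split_block` would then be passed to the transform of the matching size.
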
8x8 Integer DCT Matrices: WMV 9 and H.264 HP [matrices shown as figures on the original slide]
Key features of the Transforms
- The norms of the basis vectors are in the ratio 288:289:292
- The variation in the norms is accounted for in the encoder itself
- At the decoder: inverse transform (rows) -> rounding -> inverse transform (columns) -> rounding (to operate in the 16-bit range)

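The norm ratio can be checked numerically. The matrix below is not on the slide; it is the 8-point WMV 9 / VC-1 transform as recalled from published descriptions of the codec, so treat it as an assumption. With these coefficients the squared row norms come out as 1152, 1156 and 1168, i.e. 288:289:292 after dividing by 4.

```python
# 8-point integer transform matrix (assumed; recalled from published
# descriptions of WMV 9 / VC-1, not taken from the slides).
T8 = [
    [12, 12, 12, 12, 12, 12, 12, 12],
    [16, 15, 9, 4, -4, -9, -15, -16],
    [16, 6, -6, -16, -16, -6, 6, 16],
    [15, -4, -16, -9, 9, 16, 4, -15],
    [12, -12, -12, 12, 12, -12, -12, 12],
    [9, -16, 4, 15, -15, -4, 16, -9],
    [6, -16, 16, -6, -6, 16, -16, 6],
    [4, -9, 15, -16, 16, -15, 9, -4],
]


def squared_norms(matrix):
    """Return the squared Euclidean norm of every row."""
    return [sum(v * v for v in row) for row in matrix]


def rows_are_orthogonal(matrix):
    """Check that every pair of distinct rows has a zero dot product."""
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if sum(a * b for a, b in zip(matrix[i], matrix[j])) != 0:
                return False
    return True


# Squared norms divided by 4, one entry per basis vector.
norm_ratio = [n // 4 for n in squared_norms(T8)]
```

Because the rows are orthogonal but not of equal norm, the transform is not orthonormal; this is the variation that the slide says the encoder compensates for.
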
Quantization
- The same rule is applied to all block sizes
- Two types are available: with dead zone (saves bits at low bit rates) and without (uniform)
- The type used is signaled to the decoder at the frame level
- The encoder switches automatically from uniform to dead-zone quantization as the quantization parameter increases
- Other factors, such as noise and rate control, can also be used to control this switch

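A minimal sketch of the difference between the two quantizer types, assuming a plain scalar quantizer. The rounding offsets (0.5 for uniform, 0 for dead zone) and the switch point `switch_qp` are illustrative choices, not values from the slides or the standard.

```python
def quantize(coeff, step, dead_zone):
    """Quantize one transform coefficient to an integer level.

    Uniform: round to the nearest multiple of `step`.
    Dead zone: truncate towards zero, which widens the zero bin so that
    small coefficients are dropped and bits are saved.
    """
    offset = 0.0 if dead_zone else 0.5  # illustrative offsets (assumption)
    level = int(abs(coeff) / step + offset)
    return level if coeff >= 0 else -level


def use_dead_zone(qp, switch_qp=9):
    """Pick the quantizer type from the quantization parameter.

    `switch_qp` is a hypothetical threshold; an encoder may also take noise
    and rate control into account, as noted on the slide.
    """
    return qp >= switch_qp
```

For example, with a step of 10 a coefficient of 7 survives uniform quantization as level 1 but falls into the dead zone and becomes 0.
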
Loop Filtering
- Removes blocky artifacts and thus improves the quality of the current frame as a reference for future prediction
- Operates on pixels at the borders of blocks
- The process involves a discontinuity measurement
- The checks are computationally expensive, so they are done for only one set of pixels per boundary

Motion Estimation and Compensation
- Maximum resolution of ¼ pixel (i.e. ¼, ½, ¾) allowed
- 16x16 motion vectors used by default, but 8x8 allowed
- Bicubic filters with 4 taps or bilinear filters with 2 taps generate sub-pixel precision
- 4 combined modes
  1. Mixed block size (16x16 and 8x8), ¼ pel, bicubic
  2. 16x16, ¼ pel, bicubic
  3. 16x16, ½ pel, bicubic
  4. 16x16, ½ pel, bilinear
- Bilinear filters for chroma components

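The two filter families can be illustrated in one dimension. The 4-tap coefficients below are recalled from published descriptions of the VC-1 bicubic filter and should be treated as an assumption; the rounding is simplified and does not reproduce the normative rounding control. Function names are made up for this sketch.

```python
# 4-tap bicubic coefficients per quarter-sample offset: (taps, divisor).
# Assumed values, not taken from the slides.
BICUBIC_TAPS = {
    1: ((-4, 53, 18, -3), 64),  # 1/4 position
    2: ((-1, 9, 9, -1), 16),    # 1/2 position
    3: ((-3, 18, 53, -4), 64),  # 3/4 position
}


def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def bilinear(row, i, frac):
    """2-tap interpolation between row[i] and row[i + 1]; frac is in quarters (0..3)."""
    if frac == 0:
        return row[i]
    return (row[i] * (4 - frac) + row[i + 1] * frac + 2) >> 2


def bicubic(row, i, frac):
    """4-tap interpolation using row[i - 1] .. row[i + 2]; frac is in quarters (0..3).

    The caller must ensure 1 <= i <= len(row) - 3.
    """
    if frac == 0:
        return row[i]
    taps, divisor = BICUBIC_TAPS[frac]
    acc = sum(t * p for t, p in zip(taps, row[i - 1:i + 3]))
    return clip8((acc + divisor // 2) // divisor)
```

On a linear ramp both filters give the same half-sample value, which is a quick sanity check; they differ near edges, where the longer bicubic filter preserves more detail at a higher computational cost.
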
Advanced entropy coding
- Simple VLC codes
- Multiple code tables for encoding each particular alphabet
- One of the possible sets of code tables is chosen (based on the frame-level quantization parameter) and signaled in the bitstream
- Additional information, e.g. motion vector resolution, is coded using bitplane coding

Interlaced coding
- Supports field and frame coding

Advanced B frame coding
- B frames employ bi-directional prediction
- Fractional position definition with respect to the reference frames, for better scaling of motion vectors
- Intra-coded B frames for scene changes
- Inter-field reference allowed

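The idea of scaling motion vectors by the fractional position of a B frame can be sketched as follows. This is a simplified illustration, not the normative derivation: `scale_direct_mv` and its arguments are hypothetical names, and the rounding is Python's built-in `round`.

```python
from fractions import Fraction


def scale_direct_mv(mv, bfraction):
    """Split a co-located motion vector into forward and backward parts.

    mv        : (x, y) vector of the co-located block in the next anchor frame
    bfraction : temporal position of the B frame between its two references,
                strictly between 0 and 1 (e.g. Fraction(1, 2) for the midpoint)
    """
    if not 0 < bfraction < 1:
        raise ValueError("bfraction must lie strictly between 0 and 1")
    forward = tuple(round(c * bfraction) for c in mv)
    backward = tuple(round(c * (bfraction - 1)) for c in mv)
    return forward, backward
```

A B frame one third of the way between its references therefore takes one third of the vector towards the past reference and minus two thirds towards the future one.
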
Overlap smoothing
- The deblocking filter smooths true edges as well as block edges, and it may be disabled in less complex profiles
- A lapped transform (whose input also spans pixels from neighboring blocks) is used at the block edges
- Applied in the spatial domain as pre- and post-processing
- Used only at low bit rates, where blocking artifacts are stronger
- Signaled at macroblock level, so it can be turned off in smooth regions

Low rate tools (<100 kbps)
- Frames can be coded at multiple resolutions (scaling in both X and Y directions)
- A frame can be downscaled at the encoder and then upscaled at the decoder for low-bit-rate (LBR) transmission
- The downscaling factor must remain the same from the start of one I frame to the start of the next I frame
- The frame must be upscaled prior to display (the upscaling method is out of scope of the standard)

Fading compensation
- Scenes with effects such as fade-to-black and fade-from-black require a large number of bits
- Motion cannot be predicted well with the usual techniques
- Fading detection: if (original reference image - current video image) > threshold, fading is declared
- If fading is detected, the encoder computes fading parameters that specify a pixel-wise first-order linear transform for the reference image
- The parameters are also signaled to the decoder

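The detect-then-fit procedure can be sketched on flat lists of luma samples. The slide does not say how the difference is measured or how the parameters are estimated, so the mean absolute difference and the least-squares fit below are assumptions chosen for illustration; all function names are hypothetical.

```python
def detect_fading(ref, cur, threshold):
    """Declare fading when the mean absolute difference exceeds `threshold`."""
    mad = sum(abs(r - c) for r, c in zip(ref, cur)) / len(ref)
    return mad > threshold


def fit_fading_params(ref, cur):
    """Least-squares fit of cur ~ scale * ref + shift (first-order linear transform)."""
    n = len(ref)
    mean_ref = sum(ref) / n
    mean_cur = sum(cur) / n
    var = sum((r - mean_ref) ** 2 for r in ref)
    if var == 0:  # flat reference: only a brightness shift can be estimated
        return 1.0, mean_cur - mean_ref
    cov = sum((r - mean_ref) * (c - mean_cur) for r, c in zip(ref, cur))
    scale = cov / var
    return scale, mean_cur - scale * mean_ref


def apply_fading(ref, scale, shift):
    """Remap the reference pixel-wise before it is used for motion compensation."""
    return [max(0, min(255, round(scale * r + shift))) for r in ref]
```

Once the reference has been remapped with `apply_fading`, ordinary motion estimation can work against it, and the decoder repeats the same remapping from the signaled parameters.
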
Profiles
Feature | Simple | Main | Advanced
Baseline intra frame compression | Yes | Yes | Yes
Variable-sized transform | Yes | Yes | Yes
16-bit transform | Yes | Yes | Yes
Overlapped transform | Yes | Yes | Yes
4 motion vectors per macroblock | Yes | Yes | Yes
¼ pixel luminance motion compensation | Yes | Yes | Yes
¼ pixel chrominance motion compensation | No | Yes | Yes
Start codes | No | Yes | Yes
Extended motion vectors | No | Yes | Yes
Loop filter | No | Yes | Yes
Dynamic resolution change | No | Yes | Yes
Adaptive macroblock quantization | No | Yes | Yes
B frames | No | Yes | Yes
Intensity compensation | No | Yes | Yes
Range adjustment | No | Yes | Yes
Field and frame coding modes | No | No | Yes
GOP layer | No | No | Yes
Display metadata | No | No | Yes

Comparison of H.264 and VC-1
Overview | H.264 | VC-1
Goals | Designed to meet a variety of industry needs with many profiles and levels, allowing for varying compression, quality and CPU usage levels; the lowest level is for portable devices and is designed with low CPU usage in mind, while the high levels are designed with very high quality and compression efficiency in mind | Designed to offer very high image quality with excellent compression efficiency
Example industry use | Supports studio archiving requirements with 4:4:4 color space; separate black and white (BW) video mode | Supports 4:2:0 compression / color space
Licensing costs | Similar | Similar
Documentation | Free. Reference encoder and decoder are free as well. In addition, there are JVT and M4IF mailing lists where one may receive answers to AVC-related questions. | Not free. The reference decoder, which is not free itself, comes with external documentation. The FFmpeg project provides a free decoder.

Comparison of H.264 and VC-1
Features | H.264 | VC-1
Bitstream formats | NAL and byte stream | Single bit stream
Bitstream format | SPS (sequence parameter set), PPS (picture parameter set), slice header, macroblock | In Advanced profile each bitstream data unit has its own header. Simple and Main profiles provide neither sequence nor entry point headers.
Deblocking filter | In-loop only | In-loop and out-of-loop algorithms, overlap transform
CABAC | Only supported in Main and higher profiles | No
Variable transform size | Only in High profile and above | Yes
VLC | Yes | Yes
Slice | Contiguous / non-contiguous | Contiguous (integer number of macroblock rows only)
B frame used for predicting other pictures | Yes | No
Sub-pixel interpolation methods | 6-tap filter for half pixel, averaging for quarter pixels | Bicubic, bilinear

Comparison of H.264 and VC-1
Feature | H.264 | VC-1
Partition sizes | 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4 | 16x16, 16x8, 8x16, and 8x8
Integer transform | 4x4; 8x8 available in High Profile only | 8x8, 4x8, 8x4, and 4x4
Frame | Used for progressive or interlaced content | Used for interlaced content. Consists of bottom and top field
Macroblock sizes | 16x16 only | 16x16 only
Motion vector | Two-dimensional vector offset from current position to reference frame | Two-dimensional vector offset from current position to reference frame
Picture | A field or frame | A field or frame
Skipped MB | No data is encoded for the macroblock | No data is encoded for the macroblock

Graphically…
VC-1 only
- 8x8, 4x8, 8x4, 4x4 adaptive block size
- Frequency-independent dequantization scaling
- VLC-based entropy coding
- 4-tap bicubic filters for MC
- Relatively simple loop filter
- Overlap intra filtering
- Range reduction/expansion
- Resolution reduction/expansion
H.264 only
- 8x8 and 4x4 adaptive block size
- Frequency-dependent dequantization matrix
- CABAC or VLC
- Long filters for MC
- Complex loop filter
- Spatial intra prediction
- Multi-picture arbitrary-order referencing
- Intra PCM
Common to both
- Block motion
- 16-bit integer transforms
- Bit-exact spec
- Fading prediction
- Loop filter

Transcoding point of view [figure; surviving labels: "Adaptive", "In High profile"]
Stepping forward…
- Algorithm to deduce the picture type in VC-1 from the H.264 picture types
- Algorithm to effectively handle the transform size mismatch between H.264 and VC-1
- Algorithm to choose the best H.264 reference picture to be used for MC in VC-1

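As a starting point for the first item, one possible mapping rule follows from the comparison tables: H.264 allows B pictures to serve as references while VC-1 does not, so a referenced B picture cannot stay a B picture. The sketch below is a hypothetical rule for illustration only, not the algorithm proposed in this work.

```python
def vc1_picture_type(h264_type, used_as_reference):
    """Suggest a VC-1 picture type for an H.264 picture (illustrative rule).

    h264_type         : "I", "P" or "B"
    used_as_reference : True if later H.264 pictures predict from this one
    """
    if h264_type in ("I", "P"):
        return h264_type
    if h264_type == "B":
        # VC-1 B frames are never used for prediction, so a referenced
        # H.264 B picture is re-coded as P in this sketch.
        return "P" if used_as_reference else "B"
    raise ValueError(f"unsupported picture type: {h264_type!r}")
```

Re-coding a referenced B picture as P gives up its backward prediction, so a real transcoder would also have to re-derive motion vectors for that picture.
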
References
- Jae-Beom Lee and Hari Kalva, "An efficient algorithm for VC-1 to H.264 video transcoding in progressive compression"
- http://www.avsforum.com/avs-vb/showthread.php?p=9931723&&#post9931723
- http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx
- Sridhar Srinivasan, Pohsiang (John) Hsu, Tom Holcomb, Kunal Mukerjee, Shankar L. Regunathan, Bruce Lin, Jie Liang, Ming-Chieh Lee, Jordi Ribas-Corbera, "Windows Media Video 9: overview and applications", Windows Digital Media Division, Microsoft Corporation, Redmond, WA 98052, USA, available online at www.sciencedirect.com

Thank You
Vidhya Vijayakumar
[email_address]
