H264 video coding
  • Group of Pictures (GOP): a group of pictures, or GOP, specifies the order in which intra-frames and inter-frames are arranged. A GOP can contain the following picture types:
    I-picture or I-frame (intra-coded picture): a reference picture that corresponds to a fixed image and is independent of other picture types. Each GOP begins with this type of picture.
    P-picture or P-frame (predictive-coded picture): contains motion-compensated difference information from the preceding I- or P-frame.
    B-picture or B-frame (bidirectionally predictive-coded picture): contains difference information from the preceding and following I- or P-frame within a GOP.
    A GOP always begins with an I-frame. Several P-frames follow, each some frames apart, and B-frames fill the remaining gaps. Some video codecs allow more than one I-frame in a GOP.
    The GOP structure is often referred to by two numbers, for example M=3, N=12. The first is the distance between two anchor frames (I or P). The second is the distance between two full images (I-frames), that is, the GOP length. For this example, the GOP structure is IBBPBBPBBPBB. Instead of the M parameter, one can use the maximum number of B-frames between two consecutive anchor frames.
    The more I-frames an MPEG stream has, the easier it is to edit. However, more I-frames increase the stream size. To save bandwidth and disk space, videos prepared for internet broadcast often have only one I-frame per GOP.
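The M/N notation in the note can be turned into a frame-type string mechanically. The following is a minimal sketch, assuming one I-frame at the start of each GOP and an anchor frame every M positions; the function name `gop_pattern` is illustrative and not taken from the slides.

```python
def gop_pattern(m: int, n: int) -> str:
    """Return the display-order frame types of one GOP.

    m: distance between two anchor frames (I or P)
    n: GOP length, i.e. distance between two I-frames
    """
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    frames = []
    for position in range(n):
        if position == 0:
            frames.append("I")   # every GOP starts with an I-frame
        elif position % m == 0:
            frames.append("P")   # anchor frame every m positions
        else:
            frames.append("B")   # B-frames fill the gaps between anchors
    return "".join(frames)


if __name__ == "__main__":
    print(gop_pattern(3, 12))
```

With M=3 and N=12 this reproduces the structure quoted in the note, IBBPBBPBBPBB. Setting M=1 yields a GOP with no B-frames.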


  • 1. H.264 Subhrendu Sarkar, Computer Science, Columbia University, COMS W4995 - VOIP Security, 9/29/2008
  • 2. • Introduction • Video Formats and Quality • Video Coding and H.264 • Performance • Conclusion • References
  • 3. Introduction • What is H.264? – H.264 is a video coding standard, also known as MPEG-4 Part 10 (AVC). • H.261, H.263, MPEG-1 and MPEG-2 are some predecessors of H.264. • Video: from Latin “I see”. • Purpose of a standard – Define a coded representation (or syntax) that describes visual data in a compressed form, and a method of decoding the syntax to reconstruct visual information. – Compliant encoders and decoders can successfully interoperate with each other. • A video standard specifically does not define an encoder; rather, it defines the output that an encoder should produce. • A decoding method is defined in each standard.
  • 4. • Introduction • Video Formats and Quality
  • 5. • Pixel (Picture Element) • Interlaced Video (Frames and Fields) • Bitrate and Frame rate – Typically, 30 fps is adequate for the human visual system. – For a given codec, the higher the bitrate, the better the quality of the video.
  • 6. • Video Formats – NTSC, PAL (analogue video) • PAL (Europe, Asia, Australia, etc.) 25 frames/sec • SECAM (France, Russia, parts of Africa, etc.) 25 frames/sec • NTSC (USA, Canada, Japan, etc.) 29.97 frames/sec – According to resolution • VGA (Video Graphics Array) 640x480, QVGA • CIF (Common Intermediate Format) 352x288, QCIF, SQCIF • SDTV (e.g. 720x480) • HDTV (e.g. 1920×1080) • Color Spaces – RGB (Red, Green, Blue) – YUV, also known as YCbCr (luminance, blue and red chroma) • $Y = k_r R + k_g G + k_b B$
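The luma equation on this slide can be tried out numerically. The sketch below assumes the ITU-R BT.601 weights (k_r = 0.299, k_b = 0.114, with k_g = 1 - k_r - k_b), which the slide does not specify, and it produces chroma values centred on zero rather than offset for 8-bit storage. The function name `rgb_to_ycbcr` is illustrative.

```python
# Assumed weights (ITU-R BT.601); the slide only gives the general form.
K_R = 0.299
K_B = 0.114
K_G = 1.0 - K_R - K_B


def rgb_to_ycbcr(r: float, g: float, b: float):
    """Convert one RGB sample to (Y, Cb, Cr) with zero-centred chroma."""
    y = K_R * r + K_G * g + K_B * b        # luma: weighted sum of R, G, B
    cb = 0.5 * (b - y) / (1.0 - K_B)       # blue-difference chroma
    cr = 0.5 * (r - y) / (1.0 - K_R)       # red-difference chroma
    return y, cb, cr


if __name__ == "__main__":
    for name, rgb in [("white", (255, 255, 255)), ("red", (255, 0, 0))]:
        y, cb, cr = rgb_to_ycbcr(*rgb)
        print(f"{name}: Y={y:.2f} Cb={cb:.2f} Cr={cr:.2f}")
```

A grey or white pixel has equal R, G and B, so both chroma components come out as zero and all the information sits in Y. This is why chroma can be subsampled with little visible loss.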
  • 7. • YUV Sampling Formats – YUV 444 (4:4:4) – YUY2 (4:2:2) – YV12 or YUV420 (4:2:0) Courtesy: Images from H.264 and MPEG-4 Compression – Ian Richardson
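To see what the sampling formats mean for storage, the sketch below counts samples per frame, assuming 8 bits per sample and frame dimensions divisible by two. In 4:2:2 each chroma plane has half as many samples as the luma plane, and in 4:2:0 a quarter. The names `frame_bytes` and `raw_bitrate` are illustrative.

```python
# Divisor applied to the luma sample count to get the size of ONE chroma plane.
CHROMA_DIVISOR = {"4:4:4": 1, "4:2:2": 2, "4:2:0": 4}


def frame_bytes(width: int, height: int, sampling: str, bits: int = 8) -> int:
    """Uncompressed size of one frame in bytes."""
    luma_samples = width * height
    chroma_samples = 2 * (luma_samples // CHROMA_DIVISOR[sampling])  # Cb + Cr
    return (luma_samples + chroma_samples) * bits // 8


def raw_bitrate(width: int, height: int, sampling: str, fps: int) -> int:
    """Uncompressed bitrate in bits per second."""
    return frame_bytes(width, height, sampling) * 8 * fps


if __name__ == "__main__":
    for fmt in ("4:4:4", "4:2:2", "4:2:0"):
        print(fmt, frame_bytes(352, 288, fmt), "bytes per CIF frame")
    print(raw_bitrate(352, 288, "4:2:0", 30), "bits/s for raw CIF at 30 fps")
```

For a CIF frame, 4:2:0 needs half the bytes of 4:4:4. Even so, raw CIF at 30 fps is about 36.5 Mbit/s, which is the motivation for compression.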
  • 8. Video Quality • Subjective Video Quality • Objective Video Quality – PSNR (Peak Signal to Noise Ratio) is measured on a logarithmic scale and depends on the mean squared error (MSE) between an original and an impaired image or video frame, relative to $(2^n - 1)^2$ (the square of the highest possible signal value in the image, where n is the number of bits per image sample).
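The PSNR definition above corresponds to PSNR = 10 log10((2^n - 1)^2 / MSE), in decibels. A minimal sketch follows, assuming the two images are given as flat lists of sample values of equal length; the function name `psnr` is illustrative.

```python
import math


def psnr(original, impaired, bits: int = 8) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(impaired) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the impaired samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, impaired)) / len(original)
    if mse == 0:
        return math.inf                      # identical images
    peak_squared = (2 ** bits - 1) ** 2      # square of the highest sample value
    return 10.0 * math.log10(peak_squared / mse)


if __name__ == "__main__":
    print(round(psnr([10, 20, 30, 40], [11, 19, 31, 39]), 2), "dB")
```

Because the scale is logarithmic, halving the MSE raises the PSNR by about 3 dB.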
  • 9. • Introduction • Video Formats and Quality • Video Coding and H.264
  • 10. • A video CODEC encodes a source image or video sequence into a compressed form and decodes this to produce a copy or approximation of the source sequence. • Three main functional units of a video encoder – Spatial Model – Temporal Model – Entropy Encoder Courtesy: Images from H.264 and MPEG-4 Compression – Ian Richardson
  • 11. • Macroblock, Block and Sub-Block – 16x16 macroblocks – 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 blocks • Temporal Model – Prediction from the previous video frame • Optical flow • Block-based motion estimation and motion compensation [Figure labels: Ref Block, Ref Block, Current Macroblock]
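Block-based motion estimation can be illustrated with an exhaustive search. The sketch below assumes frames stored as lists of rows of luma values, uses the sum of absolute differences (SAD) as the matching cost, and tries every displacement in a small window. Real encoders use faster search strategies and sub-sample accuracy. The names `sad` and `full_search` are illustrative.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current and a reference block."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, cx, cy, size=4, search_range=2):
    """Find the displacement (mvx, mvy) of the best-matching reference block.

    (cx, cy) is the top-left corner of the current block. The returned vector
    points from the current block position to the matching block in `ref`.
    """
    height, width = len(ref), len(ref[0])
    best = None  # (cost, mvx, mvy)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, mvx, mvy = best
    return mvx, mvy, cost
```

Motion compensation then subtracts the matched reference block from the current block, so that only the residual and the motion vector need to be coded.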
  • 12. Motion vectors • Search region: find the best-matching MB • mv: mvx and mvy (displacement between the current block and the reference macroblock) • mvp: motion vector predictor • mvd: motion vector difference • mvd = mv − mvp (the difference between mv and mvp) [Diagram: current block with neighbouring blocks A, B, C and D; vectors labelled mvp and mvd] Courtesy: Diagram from http://wiki.multimedia.cx/index.php?title=Motion_Prediction
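The relationship between mv, mvp and mvd can be sketched as follows. The slide does not say how mvp is formed; this example assumes the component-wise median of the vectors of neighbours A, B and C, which is the common case in H.264 (special cases for partition shapes and unavailable neighbours are ignored). All function names are illustrative.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv_a, mv_b, mv_c):
    """Assumed predictor: component-wise median of the neighbours' vectors."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


def mv_difference(mv, mvp):
    """Encoder side: only the difference mvd = mv - mvp is transmitted."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_mv(mvd, mvp):
    """Decoder side: the same predictor plus the received mvd recovers mv."""
    return (mvd[0] + mvp[0], mvd[1] + mvp[1])


if __name__ == "__main__":
    mvp = predict_mv((2, 1), (4, 0), (3, 5))
    mvd = mv_difference((5, 2), mvp)
    print("mvp:", mvp, "mvd:", mvd, "mv:", reconstruct_mv(mvd, mvp))
```

Neighbouring blocks tend to move together, so mvd is usually small and cheaper to entropy-code than mv itself.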
  • 13. [Figure panels: Frame Fn; Frame Fn-1; Residual Fn – Fn-1; 16x16 Motion Vectors; Motion Compensated Reference; Motion Compensated Residual] Courtesy: Images from H.264 and MPEG-4 Compression – Ian Richardson
  • 14. • I Frames, P Frames, B Frames – I frames: spatial prediction only, for all MBs. – P frames: contain I (intra) spatially predicted MBs and P (inter) temporally predicted MBs. – B frames: bi-directionally predicted frames. • Group of Pictures (GOP) – Display Order Frame No: 0 1 2 3 4 5 6 7 8 9 Frame Type: I B B P B B P B B I ... – Encoding or Decoding Order Frame No: 0 3 1 2 6 4 5 9 7 8 (B frames 7 and 8 are predicted from frames 6 and 9, so frame 9 is decoded before them)
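The reordering between display order and decoding order follows from one rule: a B-frame can only be decoded once both of its anchor frames (the I- or P-frame before and after it) are available. A minimal sketch under that assumption is shown here; the function name `decode_order` is illustrative.

```python
def decode_order(frame_types: str) -> list:
    """Map display-order frame types (e.g. "IBBP") to decoding order.

    Each anchor frame (I or P) is emitted first, followed by the B-frames
    that precede it in display order and depend on it.
    """
    order = []
    pending_b = []                       # B-frames waiting for their next anchor
    for index, frame_type in enumerate(frame_types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            order.append(index)          # anchor frame is decoded first
            order.extend(pending_b)      # then the B-frames that needed it
            pending_b = []
    order.extend(pending_b)              # trailing B-frames with no later anchor
    return order


if __name__ == "__main__":
    print(decode_order("IBBPBBPBBI"))
```

Under this rule, a B-frame that sits between a P-frame and the following I-frame is decoded after that I-frame.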
  • 15. • Spatial Model – Intra macroblocks – Spatial correlation between macroblocks • Temporal Model – Inter macroblocks – Temporal correlation between macroblocks – Searching for similar macroblocks in reference frames • Transform – Spatial to frequency domain – Discrete Cosine Transform is used – Lossless in theory (the transform itself is invertible) Courtesy: Diagrams from H.264 and MPEG-4 Compression – Ian Richardson
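The claim that the transform is not lossy in theory can be checked with a small floating-point example. The sketch below implements an orthonormal 2-D DCT-II on a square block using plain lists. It is an illustration only: H.264 itself uses an integer approximation of the DCT rather than this floating-point form. All function names are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Forward 2-D transform: Y = C X C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D transform: X = C^T Y C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    print([[round(v, 6) for v in row] for row in dct2(flat)])
```

A flat block puts all of its energy into the single DC coefficient, and `idct2(dct2(block))` returns the original block up to floating-point rounding. The loss in a real codec comes from the quantization step that follows.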
  • 16. • Quantization – Basically, dividing the transformed coefficients by a quant value in the encoder and multiplying by the same quant value in the decoder. – Lossy. – Helps meet bitrate constraints. – The human eye is less sensitive to higher-frequency transform coefficients. • Entropy Coder – Reorder (zig-zag scan order) – Variable length coding • Run Length Coding • Huffman Coding • Arithmetic Coding • H.264 – Context Adaptive Variable Length Coding (CAVLC) – Context Adaptive Binary Arithmetic Coding (CABAC)
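The quantize, reorder and run-length steps can be chained in a few lines. This is a simplified sketch, assuming a single uniform quantizer step for a 4x4 block; real H.264 quantization uses integer scaling tables driven by a quantization parameter, and CAVLC/CABAC are far more elaborate than the run-level pairs shown here. All function names are illustrative.

```python
# Zig-zag scan order for a 4x4 block, as raster indices (row * 4 + column).
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def quantize(coeffs, qstep):
    """Encoder side: divide each coefficient by the step and round (lossy)."""
    return [[int(round(c / qstep)) for c in row] for row in coeffs]


def dequantize(levels, qstep):
    """Decoder side: multiply each level by the same step."""
    return [[level * qstep for level in row] for row in levels]


def zigzag_scan(block):
    """Reorder a 4x4 block so low-frequency coefficients come first."""
    flat = [value for row in block for value in row]
    return [flat[index] for index in ZIGZAG_4X4]


def run_level(scan):
    """Encode a scan as (zeros before the level, non-zero level) pairs.

    Trailing zeros are dropped, as an end-of-block marker would imply them.
    """
    pairs, run = [], 0
    for value in scan:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


if __name__ == "__main__":
    coeffs = [[40, 9, 5, 0], [-11, 0, 0, 0], [3, 0, 0, 0], [0, 0, 0, 0]]
    levels = quantize(coeffs, 4)
    print(run_level(zigzag_scan(levels)))
    print(dequantize(levels, 4))
```

The dequantized block differs from the input (9 becomes 8, -11 becomes -12), which is where the loss occurs. A larger step produces more zeros and shorter run-level lists at the cost of larger errors.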
  • 17. Video Encoder • In-Loop Deblocking Filter Courtesy: Diagrams from H.264 and MPEG-4 Compression – Ian Richardson
  • 18. Video Decoder • Deblocking Filter Courtesy: Diagrams from H.264 and MPEG-4 Compression – Ian Richardson
  • 19. Deblocking Filter • Non-Deblocked Image • Deblocked Image Courtesy: Images from http://compression.ru/video/deblocking/
  • 20. H.264 Profiles Courtesy: Diagram from H.264 and MPEG-4 Compression – Ian Richardson
  • 21. • Introduction • Video Formats and Quality • Video Coding and H.264 • Performance
  • 22. Sample Videos • H.263 • MPEG-4 Basic • MPEG-4 Improved • MPEG-4 Part 10 (AVC) or H.264 [Figure labels: Quality, Compression, Complexity] Courtesy: Videos from http://mac.sillydog.org/qt/compare.php
  • 23. Applications Courtesy: Table from H.264 and MPEG-4 Compression – Ian Richardson
  • 24. Conclusion • H.264 is a digital video coding standard. • H.264 gives better quality and compression than earlier codecs. • H.264 is more complex and thus requires more processing. • Different profiles cater to different kinds of applications. • There is a tradeoff between video quality, compression achieved and codec complexity.
  • 25. References • H.264 and MPEG-4 Video Compression – Ian Richardson • ISO/IEC 14496-10 and ITU-T Rec. H.264, Advanced Video Coding, 2003. • H.264 Reference Software – http://iphome.hhi.de/suehring/tml/ – Current software version: JM 14.2
  • 26. Appendix - H.264 Features • Weighted Prediction • Entropy Coding – CAVLC (Context Adaptive Variable Length Coding) – CABAC (Context Adaptive Binary Arithmetic Coding, Main profile)
  • 27. Some H.264 Features • Reference Pictures • Slices • Integer and Sub-sample prediction Courtesy: Images from H.264 and MPEG-4 Compression – Ian Richardson
  • 28. Appendix - Rate Control Courtesy: Images from H.264 and MPEG-4 Compression – Ian Richardson
  • 29. Courtesy: Images from H.264 and MPEG-4 Compression – Ian Richardson