Google VP8


Review of Google's latest video codec, VP8: its main features and a comparison to H.264 in terms of quality.

  1. 1. VP8 for Embedded Video VP8
  2. 2. Embedded Video Resolutions x4 pixels Source: Wikipedia
  3. 3. Challenges <ul><li>Memory </li></ul><ul><li>Bus bandwidth </li></ul><ul><li>MIPS ! </li></ul>
  4. 4. Video coding <ul><li>Until recently, H.264 was the only video codec used for both video conferencing and VoD </li></ul><ul><li>Even in its lower profiles, H.264 is very CPU intensive. </li></ul><ul><li>Lower-CPU codecs such as H.263 or Sorenson Spark provide a lower quality-to-bitrate ratio </li></ul>
  5. 5. Google VP8 <ul><li>Last month, at Google I/O (its developer conference), Google released VP8 as open source </li></ul><ul><li>VP8 is a lightweight video codec developed by On2. </li></ul><ul><li>VP8 provides higher quality than the H.264 Baseline profile </li></ul><ul><li>VP8 memory requirements are lower than those of the H.264 Baseline profile </li></ul><ul><li>After optimization, VP8 will have better MIPS performance than the H.264 Baseline profile </li></ul>
  6. 6. Genealogy <ul><li>VP8 is part of a well-known codec family </li></ul><ul><li>VP3 was released as open source and became Xiph.Org Theora </li></ul><ul><li>VP6 is used in Flash video </li></ul><ul><li>VP7 is used in Skype </li></ul><ul><li>Motivation: </li></ul><ul><ul><li>A “no royalties” codec </li></ul></ul>VP8 VP7 VP6 VP3 Theora
  7. 7. ADOPTION – WHO USES IT? <ul><li>Software </li></ul><ul><li>Hardware </li></ul><ul><li>Platform & Publishers </li></ul>
  8. 8. Software Adoption <ul><li>Android, Anystream, Collabora </li></ul><ul><li>Corecodec, Firefox, Adobe Flash </li></ul><ul><li>Google Chrome, iLinc </li></ul><ul><li>Inlet, Opera, ooVoo </li></ul><ul><li>Skype, Sorenson Media </li></ul><ul><li>Telestream, Wildform </li></ul>
  9. 9. Hardware Adoption <ul><li>AMD, ARM, Broadcom </li></ul><ul><li>Digital Rapids, Freescale </li></ul><ul><li>Harmonic, Logitech, ViewCast </li></ul><ul><li>Imagination Technologies, Marvell </li></ul><ul><li>NVIDIA, Qualcomm, Texas Instruments </li></ul><ul><li>VeriSilicon, MIPS </li></ul>
  10. 10. Platforms and Publishers <ul><li>Brightcove </li></ul><ul><li>HD Cloud </li></ul><ul><li>Kaltura </li></ul><ul><li>Ooyala </li></ul><ul><li>YouTube </li></ul><ul><li>Zencoder </li></ul>
  12. 12. Adaptive Loop Filter <ul><li>The improved loop filter provides better quality & performance in comparison to H.264 </li></ul>Source: On2
  13. 13. Golden Frames <ul><li>Golden frames enable better decoding of the background, which is used for prediction in later frames </li></ul><ul><li>Could be used as a resync point: </li></ul><ul><ul><li>A golden frame can reference an I frame </li></ul></ul><ul><li>Could be hidden (not for display) </li></ul>Source: On2
  14. 14. Decoding efficiency <ul><li>CABAC is an H.264 feature which improves coding efficiency but consumes many CPU cycles </li></ul><ul><li>VP8 has better entropy coding than H.264, which leads to relatively lower CPU consumption under the same conditions </li></ul><ul><li>Decoding efficiency is important for smooth operation and long battery life in netbooks and mobile devices </li></ul>Source: On2
  15. 15. Resolution up-scaling & downscaling <ul><li>Supported by the decoder </li></ul><ul><li>In real-time (RT) applications, the encoder can decide dynamically to lower the resolution when the bit rate is low and let the decoder scale. </li></ul><ul><li>Removes the decision from the application </li></ul><ul><li>No need for an I frame </li></ul>
  16. 16. VP8 BASICS <ul><li>Definitions </li></ul><ul><li>Bitstream structure </li></ul><ul><li>Frame structure </li></ul>
  17. 17. Definitions <ul><li>Frame – same as H.264 </li></ul><ul><li>Segment – analogous to a slice in H.264. Macroblocks in the same segment share settings such as: </li></ul><ul><ul><li>Probabilistic encoder/decoder settings </li></ul></ul><ul><ul><li>De-blocking filter settings </li></ul></ul><ul><li>Partition – a block of byte-aligned compressed video bits. </li></ul>
  18. 18. Definitions <ul><li>Block – 8x8 matrix of pixels </li></ul><ul><li>Macro-block – the processing unit; contains 16x16 Y pixels and two 8x8 matrices, one each for U and V: </li></ul><ul><ul><li>4 × 8x8 Y blocks </li></ul></ul><ul><ul><li>1 × 8x8 U block </li></ul></ul><ul><ul><li>1 × 8x8 V block </li></ul></ul><ul><li>Sub-block – 4x4 matrix of pixels. All DCT / WHT operations are done on sub-blocks. </li></ul>
  19. 19. Frame Types <ul><li>I Frame </li></ul><ul><li>P Frame </li></ul><ul><li>No B Frames due to patents / delays </li></ul><ul><li>Prediction </li></ul><ul><ul><li>Previous frame </li></ul></ul><ul><ul><li>“ Golden Frame” </li></ul></ul><ul><ul><li>Alt-ref frame </li></ul></ul>
  20. 20. Frame Structure <ul><li>Includes three sections: </li></ul><ul><li>Frame Header </li></ul><ul><li>Partition I </li></ul><ul><li>Partition II </li></ul>Frame Header Partition I Partition II partitions
  21. 21. Frame Header <ul><li>Byte aligned uncompressed information </li></ul><ul><li>Frame type - 1-bit frame type </li></ul><ul><ul><li>0 for key frames, 1 for inter-frame. </li></ul></ul><ul><li>Level - A 3-bit version number </li></ul><ul><ul><li>0 - 3 are defined as four different profiles with different decoding complexity; other values for future use </li></ul></ul><ul><li>show_frame - A 1-bit show_frame flag </li></ul><ul><ul><li>0 – current frame not for display </li></ul></ul><ul><ul><li>1 - current frame is for display </li></ul></ul><ul><li>Length - A 19-bit field containing the size of the first data partition in bytes. </li></ul>
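The header fields listed above add up to 24 bits (1 + 3 + 1 + 19), so they can be unpacked with a few shifts and masks. The following is a minimal sketch, not the reference decoder. It assumes the fields are packed least-significant bit first into three little-endian bytes, in the order the slide lists them; the slide itself does not state the bit order. The names `Vp8FrameTag` and `parse_frame_tag` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the fields listed on the slide. */
typedef struct {
    int      is_key_frame;          /* 1 when the frame-type bit is 0 (key frame) */
    int      version;               /* 3-bit version ("level"), 0..7 */
    int      show_frame;            /* 1 = frame is for display */
    uint32_t first_partition_size;  /* 19-bit size of the first partition, in bytes */
} Vp8FrameTag;

/* Unpack the 24-bit frame tag.
 * Assumption: LSB-first packing in 3 little-endian bytes. */
static Vp8FrameTag parse_frame_tag(const uint8_t bytes[3])
{
    uint32_t raw = (uint32_t)bytes[0]
                 | ((uint32_t)bytes[1] << 8)
                 | ((uint32_t)bytes[2] << 16);

    Vp8FrameTag tag;
    tag.is_key_frame         = (raw & 1u) == 0u;       /* bit 0: 0 = key frame */
    tag.version              = (int)((raw >> 1) & 7u); /* bits 1..3 */
    tag.show_frame           = (int)((raw >> 4) & 1u); /* bit 4 */
    tag.first_partition_size = (raw >> 5) & 0x7FFFFu;  /* bits 5..23 */
    return tag;
}
```

Under that packing assumption, the bytes `0x50 0x01 0x00` decode to a displayable key frame with version 0 and a 10-byte first partition.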
  22. 22. Partition I <ul><li>Partition I </li></ul><ul><ul><li>Header information for the entire frame </li></ul></ul><ul><ul><li>Per-macroblock information specifying how each macroblock is predicted. This information is presented in raster-scan order </li></ul></ul>
  23. 23. Partition II <ul><li>Texture information - DCT/WHT quantized coefficients </li></ul><ul><li>Optionally each macroblock row could be mapped to a separate partition. </li></ul><ul><li>Partition II might be divided into several partitions for parallel processing </li></ul>Frame Header Partition I Texture Data Partition IIA Partition IIB Partition IIn
  24. 24. Decoder <ul><li>Holds 4 frames: </li></ul><ul><ul><li>Current reconstructed frame </li></ul></ul><ul><ul><li>Previous frame </li></ul></ul><ul><ul><li>Previous “Golden Frame” </li></ul></ul><ul><ul><li>Previous Alt-ref frame </li></ul></ul><ul><li>Frame dimensions can change in every frame </li></ul>
  25. 25. VP8 block diagram [Figure: encoder block diagram. Block labels: Input Video Signal, Split into Macroblocks, Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, Dynamic De-blocking, Intra-frame Prediction, Motion-Compensation, Motion Estimation, Intra/Inter, Decoder, Entropy Coding. Signal labels: Control Data, Quant. Transf. coeffs, Motion Data, Output Video]
  26. 26. VP8 BLOCK CODING
  27. 27. VP8 Macroblock coding Divide into 16x16 macroblocks Divide into 8x8 blocks Process as 4x4 sub-blocks 4x4 WHT 4x4 DCT <ul><li>Each macroblock is divided into 25 sub-blocks: </li></ul><ul><li>16 Y sub-blocks </li></ul><ul><li>4 U sub-blocks, </li></ul><ul><li>4 V sub-blocks </li></ul><ul><li>1 Y2 DC values sub-block (WHT) </li></ul>DC/AC Coeff
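The count of 25 follows from the sizes given in the definitions: a 16x16 Y area holds 16 sub-blocks of 4x4, each 8x8 chroma block holds 4, and one extra Y2 sub-block collects the DC values. A small sketch of that arithmetic, with hypothetical names; rounding partial macroblocks up at the frame edge is an assumption, not something the slides state.

```c
enum {
    MB_SIZE       = 16,  /* luma macroblock width and height */
    CHROMA_SIZE   = 8,   /* U and V block width and height */
    SUBBLOCK_SIZE = 4    /* sub-block width and height */
};

/* Number of 4x4 sub-blocks coded per macroblock: Y + U + V + Y2. */
static int subblocks_per_macroblock(void)
{
    int y  = (MB_SIZE / SUBBLOCK_SIZE) * (MB_SIZE / SUBBLOCK_SIZE);         /* 16 */
    int u  = (CHROMA_SIZE / SUBBLOCK_SIZE) * (CHROMA_SIZE / SUBBLOCK_SIZE); /* 4 */
    int v  = u;                                                             /* 4 */
    int y2 = 1;                            /* DC values of the Y sub-blocks */
    return y + u + v + y2;
}

/* Number of macroblocks covering a frame.
 * Assumption: partial macroblocks at the right/bottom edge count as whole ones. */
static int macroblocks_in_frame(int width, int height)
{
    int cols = (width  + MB_SIZE - 1) / MB_SIZE;
    int rows = (height + MB_SIZE - 1) / MB_SIZE;
    return cols * rows;
}
```

For example, a 640x480 frame is covered by 40 x 30 = 1200 macroblocks, each carrying 25 sub-blocks.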
  28. 28. DCT & iDCT <ul><li>Very inefficient – uses 16-bit multiplication in the decoder </li></ul><ul><li>Uses exact values of pixels </li></ul><ul><ul><li>+Memory </li></ul></ul><ul><ul><li>+Accuracy and no drift </li></ul></ul>static const int cospi8sqrt2minus1 = 20091; /* (sqrt(2) * cos(pi/8) - 1) * 65536 */ static const int sinpi8sqrt2 = 35468; /* sqrt(2) * sin(pi/8) * 65536 */ temp1 = (ip[4] * sinpi8sqrt2 + rounding) >> 16;
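The two constants on this slide are fixed-point values scaled by 65536: 35468 is about sqrt(2) * sin(pi/8) * 65536, and 20091 is about (sqrt(2) * cos(pi/8) - 1) * 65536. Storing the cosine term minus one presumably keeps the constant below 65536, so the full product is recovered by adding the input back. The sketch below illustrates that multiply-and-shift pattern. The helper names are hypothetical, rounding is left out, and inputs are assumed non-negative, because right-shifting a negative value is implementation-defined in C.

```c
static const int cospi8sqrt2minus1 = 20091; /* (sqrt(2) * cos(pi/8) - 1) * 65536 */
static const int sinpi8sqrt2       = 35468; /* sqrt(2) * sin(pi/8) * 65536 */

/* x * sqrt(2) * sin(pi/8), truncated. Assumes x >= 0. */
static int mul_sinpi8sqrt2(int x)
{
    return (int)(((long long)x * sinpi8sqrt2) >> 16);
}

/* x * sqrt(2) * cos(pi/8), truncated. Assumes x >= 0.
 * The constant holds only the fractional part, so x is added back. */
static int mul_cospi8sqrt2(int x)
{
    return x + (int)(((long long)x * cospi8sqrt2minus1) >> 16);
}
```

With an input of 1000 the helpers return 541 and 1306, close to the exact products 541.2 and 1306.6.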
  29. 29. Quantization <ul><li>There are 6 quantizers, each with its own levels </li></ul><ul><li>The quantizer depends on the combination of: </li></ul><ul><ul><li>Plane: Y, U, V </li></ul></ul><ul><ul><li>Coefficient: AC, DC </li></ul></ul><ul><li>The quantizer level is indicated by a 7-bit number, which is an index into one of the 6 quantization tables </li></ul>
  30. 30. VP8 PREDICTION <ul><li>Inter-prediction </li></ul><ul><li>Intra prediction </li></ul>
  31. 31. Macroblock Intra Prediction <ul><li>Intra-prediction exploits the spatial coherence between macroblocks without referring to other frames. </li></ul><ul><li>Modes </li></ul><ul><ul><li>Same as H.264 in i16x16 and i4x4 </li></ul></ul><ul><ul><li>Some H.264 modes, such as i8x8, are missing </li></ul></ul>
  32. 32. Intra prediction - blocks used [Figure: neighbouring blocks used for intra prediction; cells are labelled “M”, “Not Relevant” and “Not Available”]
  33. 33. Intra-frame prediction - Chroma <ul><li>Chroma prediction - each 8x8 chroma block is predicted separately by one of the four prediction methods listed below: </li></ul><ul><ul><li>Vertical - Copying the row from above throughout the prediction buffer. </li></ul></ul><ul><ul><li>Horizontal - Copying the column from the left throughout the prediction buffer. </li></ul></ul><ul><ul><li>DC - Copying the average value of the row and column throughout the prediction buffer. </li></ul></ul><ul><ul><li>Extrapolation from the row and column using the (fixed) second difference (horizontal and vertical) from the upper left corner. </li></ul></ul>
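The four methods listed on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the codec's actual routine: all neighbouring pixels are taken to be available (frame-edge handling is omitted), the DC value is the rounded mean of the 8 pixels above and the 8 pixels to the left, and the extrapolation mode is read as left + above - corner, clamped to 0..255. All names are hypothetical.

```c
#include <stdint.h>

enum { CHROMA_BLOCK = 8 };

typedef enum {
    PRED_VERTICAL,
    PRED_HORIZONTAL,
    PRED_DC,
    PRED_EXTRAPOLATE
} ChromaPredMode;

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clamp_pixel(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Fill an 8x8 prediction buffer from the row above, the column to the left,
 * and the upper-left corner pixel. Assumes all neighbours are available. */
static void predict_chroma_8x8(ChromaPredMode mode,
                               const uint8_t above[CHROMA_BLOCK],
                               const uint8_t left[CHROMA_BLOCK],
                               uint8_t corner,
                               uint8_t out[CHROMA_BLOCK][CHROMA_BLOCK])
{
    /* Rounded mean of the 16 neighbouring pixels, used by the DC mode. */
    int sum = 0;
    for (int i = 0; i < CHROMA_BLOCK; i++)
        sum += above[i] + left[i];
    const uint8_t dc = (uint8_t)((sum + CHROMA_BLOCK) / (2 * CHROMA_BLOCK));

    for (int r = 0; r < CHROMA_BLOCK; r++) {
        for (int c = 0; c < CHROMA_BLOCK; c++) {
            switch (mode) {
            case PRED_VERTICAL:    out[r][c] = above[c]; break;
            case PRED_HORIZONTAL:  out[r][c] = left[r];  break;
            case PRED_DC:          out[r][c] = dc;       break;
            case PRED_EXTRAPOLATE:
                out[r][c] = clamp_pixel(left[r] + above[c] - corner);
                break;
            }
        }
    }
}
```

The encoder would then code only the difference between the real block and this prediction buffer.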
  34. 34. 8x8 Chroma prediction modes <ul><li>U, V and Y predictions are done separately; the prediction of one channel does not affect the other channels. </li></ul>
  35. 35. i4x4 Prediction <ul><li>4x4 blocks are predicted by </li></ul><ul><ul><li>the four 16x16 prediction methods </li></ul></ul><ul><ul><li>six “diagonal” prediction methods </li></ul></ul>Diagonal Down/left Vertical-right Horizontal-down Vertical-left Horizontal-up Diagonal Down/right
  36. 36. Inter-frame prediction - Luma <ul><li>Definition - Inter-prediction exploits the temporal coherence between frames to save bitrate. </li></ul><ul><li>Luma sub-block prediction </li></ul><ul><ul><li>Method - each Y 4x4 sub-block is related to a 4x4 sub-block of the prediction frame. </li></ul></ul><ul><ul><li>Precision – motion vector precision is quarter-pixel (q-pel). </li></ul></ul><ul><ul><li>Interpolation – each interpolated pixel is calculated by applying a six-tap filter kernel (three pixels on each side), horizontally and vertically. </li></ul></ul>
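A one-dimensional sketch of the interpolation step for the half-pixel position, using a six-tap kernel that reads three pixels on each side of the target position. The taps {3, -16, 77, 77, -16, 3} (sum 128) are the half-pixel taps as recalled from the published VP8 bitstream guide and should be treated as an assumption to verify against it. The function name is hypothetical, and a real decoder applies such a filter in both directions and for the other fractional positions.

```c
#include <stdint.h>

/* Assumed half-pixel taps; they sum to 128, hence the shift by 7 below. */
static const int half_pel_taps[6] = { 3, -16, 77, 77, -16, 3 };

/* Interpolate the sample halfway between p[0] and p[1].
 * The caller must guarantee that p[-2] .. p[3] are readable. */
static uint8_t interp_half_pel(const uint8_t *p)
{
    int acc = 0;
    for (int k = 0; k < 6; k++)
        acc += half_pel_taps[k] * p[k - 2];

    acc += 64;               /* rounding term: half of 128 */
    if (acc < 0)
        acc = 0;             /* clamp before shifting to avoid shifting negatives */
    acc >>= 7;               /* divide by 128 */
    return (uint8_t)(acc > 255 ? 255 : acc);
}
```

On a flat row the filter returns the input value unchanged, and across a 0-to-255 step edge it returns 128, the midpoint.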
  37. 37. Inter-frame Prediction - Chroma <ul><li>Chroma precision - the calculated chroma motion vectors have 1/8 pixel resolution </li></ul><ul><li>Each chroma vector is calculated by averaging the vectors of the four Y sub-blocks that occupy the same area of the frame. </li></ul>
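A minimal sketch of the averaging step. Since the chroma planes have half the luma resolution, a vector expressed in quarter-pixel luma units addresses 1/8-pixel positions when applied to chroma, which matches the precision stated on the slide. The rounding rule used here (half away from zero) is an assumption, and the exact rule in the bitstream specification may differ. The type and function names are hypothetical.

```c
/* Motion vector in quarter-pixel luma units. */
typedef struct {
    int row;
    int col;
} MotionVector;

/* Divide a sum of four components by 4, rounding half away from zero.
 * Assumption: the real codec may round differently. */
static int average_of_four(int sum)
{
    return (sum >= 0 ? sum + 2 : sum - 2) / 4;
}

/* Derive one chroma vector from the four luma sub-block vectors
 * that cover the same area of the frame. */
static MotionVector chroma_mv_from_luma(const MotionVector luma[4])
{
    int row_sum = 0;
    int col_sum = 0;
    for (int i = 0; i < 4; i++) {
        row_sum += luma[i].row;
        col_sum += luma[i].col;
    }

    MotionVector chroma;
    chroma.row = average_of_four(row_sum);
    chroma.col = average_of_four(col_sum);
    return chroma;
}
```

The numeric result is then interpreted on the half-resolution chroma plane, where one unit corresponds to 1/8 of a pixel.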
  39. 39. Talking heads, Low motion <ul><li>Low motion videos like talking heads are easy to compress, so you'll see no real difference </li></ul>Source: Jan Ozer, Streaming Media
  40. 40. Low motion <ul><li>In another low motion video with a terrible background for encoding (finely detailed wallpaper), the VP8 video retains much more detail than H.264. Interesting result. </li></ul>
  41. 41. Medium motion <ul><li>VP8 holds up fairly well </li></ul>
  42. 42. High motion <ul><li>In high motion videos, H.264 seems superior. In this sample, blocks are visible in the pita where the H.264 video is smooth. The pin-striped shirt in the right background is also sharper in the H.264 video, as is the striped shirt on the left. </li></ul>
  43. 43. Very High motion <ul><li>In this very high motion skateboard video, H.264 also looks clearer, particularly in the highlighted areas in the fence, where the VP8 video has artifacts. </li></ul>
  44. 44. Final <ul><li>In the final comparison, I'd give a slight edge to VP8, which was clearer and showed fewer artifacts. </li></ul>
  45. 45. Quality Comparison Source: Jan Ozer, Streaming Media
  46. 46. DSP-IP Contact information Download slides at: Courses & lecture request Projects development services: Adi Yakov Training Manager [email_address] +972-9-8651933 Mail : [email_address] Phone: +972-9-8850956, Fax : +972-50- 8962910 <ul><ul><li>Alona Ashkenazi </li></ul></ul><ul><li>Development Services </li></ul><ul><li>[email_address] +972-9-8850956 </li></ul>