Chroma from Luma (CfL) in AV1
Status Update
Luc Trudeau
Alliance for Open Media Working Group, April 2017
Mozilla and the Xiph.Org Foundation
Chroma from Luma (CfL) moz://a
A coding tool that predicts information in the chromatic planes based on
previously encoded information in the luma plane.
[Figure: three panels showing the luminance (luma) plane, the blue-difference chroma plane (Cb), and the red-difference chroma plane (Cr)]
CfL assumes that luma and chroma are locally correlated; this can be seen in the images above [1].
[1] Notice the resemblance between luma and chroma.
Proposed vs. prior CfLs
                            LM Mode [2]                Thor CfL [3]   Daala CfL [4]        Proposed
Prediction domain           Spatial                    Spatial        Frequency            Spatial
Bitstream signaling         No                         No             Sign bit, PVQ gain   Alpha
Activation mechanism        LM Mode, 4 × 4 and 8 × 8   Threshold      Signaled             DC_PRED
Requires PVQ                No                         No             Yes                  No
Encoder model fitting       Yes                        Yes            Via PVQ              Yes
Decoder model fitting [5]   Yes [6]                    Yes            No                   No
[2] Chen et al., “Chroma Intra Prediction by Reconstructed Luma Samples”, JCTVC-E266, http://phenix.int-evry.fr/jct/doc_end_user/documents/5_Geneva/wg11/JCTVC-E266-v4.zip (March 2011)
[3] Steinar Midtskogen, “Improved chroma prediction”, IETF draft-midtskogen-netvc-chromapred-02, https://tools.ietf.org/html/draft-midtskogen-netvc-chromapred-02 (October 2016)
[4] Nathan E. Egge and Jean-Marc Valin, “Predicting Chroma from Luma with Frequency Domain Intra Prediction”, https://people.xiph.org/~unlord/spie_cfl.pdf (April 2016)
[5] When model fitting is performed in the decoder, the original chroma planes cannot be used.
[6] Complexity concerns about model fitting during decoding were one of the reasons LM Mode was removed from HEVC.
Description of the Proposed CfL
Computing the CfL scaling factor (encoder only)
For an intra prediction block whose uv_mode is DC_PRED, α is computed from the reconstructed luma pixels, the original chroma pixels, and a prediction-block-level DC_PRED [7].
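[Block diagram: the reconstructed luma pixels are downsampled [8] and their average is subtracted; the chroma DC_PRED pixels are subtracted from the original chroma pixels; a least-squares fit between the two results gives α.]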
[7] DC_PRED is computed over the entire prediction block size, not over the transform size.
[8] This step is required if the sequence is not 4:4:4.
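The fit described on this slide can be sketched in a few lines. This is a minimal illustration, not the AV1 encoder code: it assumes a single-parameter least-squares model with no intercept, takes the block as a flat list of samples, and assumes the luma has already been brought to chroma resolution. The name `fit_alpha` is hypothetical.

```python
def fit_alpha(luma, chroma, dc_pred):
    """Least-squares scaling factor for one chroma plane of one block.

    luma    -- reconstructed luma pixels, already at chroma resolution
    chroma  -- original chroma pixels (same length as luma)
    dc_pred -- block-level chroma DC_PRED value
    """
    luma_avg = sum(luma) / len(luma)
    # Zero-mean luma: each pixel minus the block average.
    l_ac = [l - luma_avg for l in luma]
    # Chroma residual with respect to DC_PRED.
    c_ac = [c - dc_pred for c in chroma]
    denom = sum(l * l for l in l_ac)
    if denom == 0:
        return 0.0  # flat luma carries no information about chroma
    # Closed-form minimizer of sum((c_ac - alpha * l_ac) ** 2).
    return sum(l * c for l, c in zip(l_ac, c_ac)) / denom


luma = [100, 104, 108, 112]
chroma = [126, 128, 130, 132]
alpha = fit_alpha(luma, chroma, dc_pred=129)
```

In this toy block the chroma residual is exactly half the zero-mean luma, so the fit returns 0.5. The encoder would run this once per chroma plane, giving one α for Cb and one for Cr.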
CfL prediction (encoder and decoder)
The CfL prediction is computed from the reconstructed luma pixels [9], the prediction-block-level DC_PRED, and the dequantized α value.
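[Block diagram: the reconstructed luma pixels are downsampled [10] and their average is subtracted; the result is multiplied by the signaled α and added to the chroma DC_PRED pixels to give the CfL prediction.]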
[9] Resampling is required for 4:2:0 (4 adds and 1 shift per chroma pixel).
[10] This step is required if the sequence is not 4:4:4.
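The prediction path can be sketched as follows. This is an illustration under stated assumptions rather than the decoder implementation: the 4:2:0 resampling is taken to be a rounded 2×2 average (three adds to sum the pixels, one add for rounding, one shift), floating-point arithmetic is used for clarity, and clipping to the valid pixel range is omitted. The function names are hypothetical.

```python
def downsample_420(luma, width, height):
    """Average each 2x2 luma neighbourhood down to one chroma-sized sample."""
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            # Four additions (three to sum the pixels, one for rounding)
            # and one shift per chroma pixel.
            total = (luma[y][x] + luma[y][x + 1]
                     + luma[y + 1][x] + luma[y + 1][x + 1] + 2)
            row.append(total >> 2)
        out.append(row)
    return out


def cfl_predict(luma_ds, alpha, dc_pred):
    """CfL prediction: alpha * (luma - average(luma)) + DC_PRED."""
    pixels = [p for row in luma_ds for p in row]
    avg = sum(pixels) / len(pixels)
    return [[alpha * (p - avg) + dc_pred for p in row] for row in luma_ds]


luma = [
    [100, 100, 104, 104],
    [100, 100, 104, 104],
    [108, 108, 112, 112],
    [108, 108, 112, 112],
]
luma_ds = downsample_420(luma, width=4, height=4)
pred = cfl_predict(luma_ds, alpha=0.5, dc_pred=129)
```

Because only the zero-mean luma is scaled, the average of the prediction stays equal to DC_PRED; α only shapes the variation around it.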
Vector Quantization of αCb and αCr
Both chroma αs are vector quantized. The vector quantization table [11] is trained using k-means (Lloyd’s algorithm).
[11] The quantization table is normative, but could instead be added to the bitstream.
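As a rough illustration of how such a table could be trained and used, the sketch below runs Lloyd's algorithm on two-dimensional (αCb, αCr) pairs and then quantizes a pair to its nearest codeword. The training data, the initial codebook, the use of magnitudes only, and the function names are all hypothetical; the actual training set and table are not given in the slides.

```python
def nearest(codebook, point):
    """Index of the codeword closest (squared Euclidean) to point."""
    return min(range(len(codebook)),
               key=lambda i: (codebook[i][0] - point[0]) ** 2
               + (codebook[i][1] - point[1]) ** 2)


def train_codebook(points, codebook, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    codebook = list(codebook)
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for p in points:
            clusters[nearest(codebook, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                codebook[i] = (sum(m[0] for m in members) / len(members),
                               sum(m[1] for m in members) / len(members))
    return codebook


# Hypothetical training pairs (|alpha_Cb|, |alpha_Cr|).
training = [(0.0, 0.0), (0.0, 0.25), (0.5, 0.5),
            (0.5, 0.75), (1.0, 0.0), (1.0, 0.25)]
initial = [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0)]
codebook = train_codebook(training, initial)
index = nearest(codebook, (0.45, 0.7))
```

Quantizing the pair jointly means one index selects both values, which is what allows them to be coded as a single symbol.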
Signaling
αCb and αCr are combined into a single symbol coded with a 16-value CDF.
The sign of each α is signaled using a bit [12].
[Flowchart: if the block is not skipped (mbmi->skipped == 0), ind = aom_read_symbol() is read; then sign[0] = aom_read_bit() is read if cfl_alpha_codes[ind][0] != 0, and sign[1] = aom_read_bit() is read if cfl_alpha_codes[ind][1] != 0.]
[12] Both signs could be signaled using a single symbol.
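The parsing order in the flowchart can be mimicked with a small mock. This is a sketch only: `FakeReader` stands in for the entropy decoder and is not the libaom API, the four-entry table is a made-up excerpt rather than the real 16-entry code book, and returning zeros for a skipped block is an assumption.

```python
class FakeReader:
    """Stand-in for the entropy decoder; replays pre-recorded values."""

    def __init__(self, values):
        self._values = list(values)

    def read_symbol(self):
        return self._values.pop(0)

    def read_bit(self):
        return self._values.pop(0)


# Hypothetical 4-entry excerpt of a code book: (|alpha_Cb|, |alpha_Cr|).
CFL_ALPHA_CODES = [(0.0, 0.0), (0.0, 0.25), (0.5, 0.0), (0.5, 0.25)]


def read_cfl_alphas(reader, skipped):
    """Return (alpha_cb, alpha_cr) for one block."""
    if skipped:
        return (0.0, 0.0)  # assumption: nothing is read for skipped blocks
    ind = reader.read_symbol()
    alphas = []
    for magnitude in CFL_ALPHA_CODES[ind]:
        # A sign bit is only present when the magnitude is non-zero.
        negative = reader.read_bit() if magnitude != 0 else 0
        alphas.append(-magnitude if negative else magnitude)
    return tuple(alphas)


reader = FakeReader([2, 1])  # index 2, then one sign bit for alpha_Cb
decoded = read_cfl_alphas(reader, skipped=False)
```

Here index 2 selects (0.5, 0.0), so only one sign bit is consumed: a zero-valued α has no sign to signal.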
Results (AWCY)
AV1 with default experiments enabled vs. AV1 with CfL
Dataset: Subset 1 (50 4:2:0 still images)
BD-rate (%) with respect to each metric (negative values are bitrate savings):
PSNR      PSNR Cb    PSNR Cr    PSNR HVS   SSIM     MS SSIM   CIEDE 2000 [13]
1.2589    -15.5532   -13.6577   1.4392     1.3205   1.2999    -4.4521
https://arewecompressedyet.com/?job=master%402017-03-27T18%3A41%3A56.236Z&job=CfL_Double_DC_PRED%402017-04-15T01%3A48
[13] CIEDE2000 is the only metric that considers both luma and chroma planes.
Examples
AV1
Plane: Blue-difference chroma plane (Cb)
PSNR Cb: 41.0378 dB
QP=55
Sequence: Washington Monument, Washington, D.C. 04037u original.y4m (subset1)
Analyzer link: https://goo.gl/69N6LC
AV1 + CfL
Plane: Blue-difference chroma plane (Cb)
PSNR Cb: 42.9614 dB
QP=55
Sequence: Washington Monument, Washington, D.C. 04037u original.y4m (subset1)
Analyzer link: https://goo.gl/69N6LC
AV1
PSNR Cb: 40.9118 dB
PSNR Cr: 41.3498 dB
QP=55
Sequence: US Navy 111117-N-UB993-082 A Sailor examines a patient during drill.y4m (subset1)
Analyzer link: https://goo.gl/NdLyzu
AV1 + CfL
PSNR Cb: 41.9660 dB
PSNR Cr: 42.4291 dB
QP=55
Sequence: US Navy 111117-N-UB993-082 A Sailor examines a patient during drill.y4m (subset1)
Analyzer link: https://goo.gl/NdLyzu
Future Work
Work remaining before proposal for adoption
• Improve quantization tables and probability tables
• Optimize code book size
• Optimize encoder α selection to improve coding efficiency
• Use multisymbol to signal sign bits
• 4:2:2 support
Ongoing research
• Tuning the luma-chroma balance to, e.g., reduce or eliminate the
loss in luma PSNR (for a smaller CIEDE gain)
• Add the code book to the bitstream
• CfL in inter (similar to Thor)
