1



     Trends and challenges
        in video coding
                 Prof. Dr. Touradj Ebrahimi
                 VISNET-II Summer School
           KOC University, Istanbul, June 15-19, 2009




Multimedia Signal Processing Group
Swiss Federal Institute of Technology, Lausanne
2
                                    Outline

•  First things first…
•  Trends in video coding
•  Challenges in video coding
•  Some last words…




4
                            First things first…

•  Often the future is not the result of one,
   but many trends occurring in parallel,
   which at times, when interacting, can lead
   to results not easily predictable when
   considering only one or a subset of such
   trends




5
                            First things first…

•  Is there a Moore’s law of compression?




6
                            First things first…




7
                            First things first…

•  Not only between different technologies…




8
                            First things first…

•  … but also within the same technology
                              MPEG-2 video




9
                                    Outline

•  First things first…
•  Trends in video coding…
•  Challenges in video coding
•  Some last words…




10
                        Trends in video coding

•  Trends in video coding are influenced by:
    –  Trends in the type/nature of content
    –  Trends in technologies/products
    –  Trends in applications/services




11
                 Trends in type/nature of content

•  Explosion in all dimensions:
    –  Spatial resolution (QCIF to HD to UHD)
    –  Temporal resolution (25 to 60 to 200 Hz)
    –  Spatial dimensions (2D to 3D to Holography)
    –  Number of components (Y to RGB to RG1G2B)
    –  Dynamic range of each component (8 to 16 bpp
       to floating point)
•  An increasing number of movies use
   computer-generated content
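
To get a feel for what this growth means for uncompressed data rates, the dimensions listed above can simply be multiplied together. The following is a minimal sketch, assuming illustrative frame sizes (176x144 for QCIF, 1920x1080 for HD, 3840x2160 for UHD) and full-resolution sampling of every component. The function name `raw_bitrate_mbps` and the three format combinations are hypothetical choices for illustration, not taken from the slides.

```python
def raw_bitrate_mbps(width, height, fps, components, bits_per_component):
    """Uncompressed bit rate in Mbit/s.

    Assumes every component is sampled at full spatial resolution
    (no chroma subsampling) and ignores blanking and container overhead.
    """
    return width * height * fps * components * bits_per_component / 1e6


# Hypothetical format combinations picked from the ranges on the slide.
formats = {
    "QCIF, 25 Hz, Y only, 8 bit": (176, 144, 25, 1, 8),
    "HD, 60 Hz, RGB, 8 bit": (1920, 1080, 60, 3, 8),
    "UHD, 200 Hz, RGB, 16 bit": (3840, 2160, 200, 3, 16),
}

for name, params in formats.items():
    print(f"{name}: {raw_bitrate_mbps(*params):,.1f} Mbit/s")
```

Under these assumptions the raw rate grows from about 5 Mbit/s to almost 80 Gbit/s, which is why growth in every dimension at once puts pressure on coding efficiency.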

12
                 Trends in technologies/products

•  Capture
•  CPU/DSP
•  Communication channels
•  Storage
•  Display/Printing




13
                 Trends in technologies/products




14
                 Trends in technologies/products




15
                     Trends in applications/services

•  Prosumer (producer/consumer) models
   →  Social networks: YouTube/Facebook/Twitter
   →  …
•  New types of access to content
   →  Podcasting
   →  P2P
   →  IPTV
   →  …




16
                          Trends in video coding

•  Evolutions of existing architectures/tools
•  New tools in existing/extended architectures
•  Disruptive architectures/tools




17
          Evolutions of existing architectures/tools

•  An example… H.265/KTA in ITU-T




18


•  H.265 is a long-term video coding standardization effort ‘launched’ by ITU-T VCEG.
•  It is not yet formalized: VCEG is still seeking proposals and evidence of a performance
   gain large enough to justify the step from H.264 to H.265.
•  Although the scope of H.265 is still largely to be determined, it is agreed that
   the goals will include:
    –  High coding efficiency, e.g., twice that of H.264/AVC
    –  Computational efficiency, considering both encoder and decoder
    –  Loss/error robustness
    –  Network friendliness
•  So far, contributions to VCEG have mainly focused on improving coding
   efficiency.
•  To better evaluate these contributions and retain progress, the KTA (Key
   Technical Area) software platform has been developed, using JM11 as
   the baseline and continuously integrating promising tools.


19


•  Inter prediction
   –  Adaptive interpolation filter (AIF)
         2-D non-separable AIF (AD08, AE16)
         Separable AIF (COM16-C219, AG10)
         Directional AIF (DAIF) (AG21, AG22, AH17, AH18)
         Enhanced DAIF (E-DAIF) (AI12, COM16-C125, COM16-C126)
         Enhanced AIF (EAIF) (C464, AI38, AJ30)
         Switched interpolation filters with offsets (SIFO) (C463, AI35, AJ29, COM16-C126)
         High precision filter (HPF) (AI33)
         Single-pass encoding (AJ29, AK26)

   –  1/8-pel motion compensated prediction (MCP) (AD09)
   –  Extended MCP block size (COM16-C123)
   –  Competition-based MV prediction (AC06)


20



•  Transform and Quantization
  –  Mode-dependent directional transform (MDDT) (AF15,
     AG11, AH20, AJ24, AI36)
  –  Very large block transform (COM16-C123)
  –  Adaptive prediction error coding (APEC) (AB06, AD07,
     AE15)
  –  Adaptive quantization matrix selection (AQMS) (AC07,
     AD06, AF08, AI19)
  –  Rate-distortion optimized quantization (RDO-Q) (AH21)




21



•  Entropy coding
   –  Parallel CABAC (COM16-C405, AI32)
•  In-loop filter
   –  Block-based adaptive loop filter (BALF) (AI18, AJ13)
   –  Quadtree-based adaptive loop filter (QALF) (COM16-C181, AK22)
•  Post filter (AI34, COM16-C128)
•  Internal bit depth increase (IBDI) (AE13, AF07)



22
      New tools in existing/extended architectures…

•  An example… FTV in MPEG




23


•  Synthesize a continuum of views based on a limited set of views
•  Specify a format that fixes the rate, but allows an arbitrarily large
   number of views to be rendered




24


•  Extend MVC framework to include multi-view video plus depth




25




26
                 Disruptive architectures/tools…

•  An example… Compressive Sensing




27


                                                     The signal x is compressible if
                                                      the α representation has just a
                                                      few large coefficients and
                                                      many small coefficients.
                                                  from Baraniuk, dsp.rice.edu/cs
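
The compressibility condition can be checked numerically by transforming a signal and looking at how its energy is distributed over the sorted coefficients. This is a minimal sketch, assuming an orthonormal DCT as the representation basis and a smooth ramp as the test signal; both are illustrative choices and are not specified on the slide.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples (energy is preserved)."""
    n = len(x)
    return [
        (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
        * sum(xi * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i, xi in enumerate(x))
        for k in range(n)
    ]


x = [float(i) for i in range(32)]      # hypothetical smooth test signal
alpha = dct(x)                          # the "alpha representation" of x

# Sort coefficient energies from largest to smallest.
energy = sorted((a * a for a in alpha), reverse=True)
total = sum(energy)
top4 = sum(energy[:4]) / total

print(f"share of energy in the 4 largest of {len(alpha)} coefficients: {top4:.4f}")
```

For this signal nearly all of the energy sits in a handful of coefficients, which is the situation the slide describes as compressible.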




28




                                                        y=Φx=ΦΨs=Θs




•  Compressive sensing addresses the inefficiency of traditional
  sample-then-compress acquisition by directly acquiring a compressed
  signal representation, without going through the intermediate stage
  of acquiring N samples. The measurement process is not adaptive,
  meaning that Φ is fixed and does not depend on the signal x.
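
The measurement model y = Φx = ΦΨs = Θs can be sketched in a few lines. The example below is a hedged illustration only: it assumes a DCT basis for Ψ, a random Gaussian Φ, and orthogonal matching pursuit (OMP) as the recovery step. None of these choices comes from the slides, and all function names (`dct_basis`, `omp`, `solve`, and so on) are hypothetical helpers.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def matmul(a, b):
    cols = list(zip(*b))
    return [[dot(row, col) for col in cols] for row in a]


def dct_basis(n):
    """Psi[i][k]: orthonormal DCT basis vectors as columns, so x = Psi s."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for k in range(n)] for i in range(n)]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x


def omp(theta, y, k):
    """Orthogonal matching pursuit: greedily pick k columns of theta that explain y."""
    cols = [list(c) for c in zip(*theta)]
    norms = [math.sqrt(dot(c, c)) for c in cols]
    support, coef, residual = [], [], list(y)
    for _ in range(k):
        # Column most correlated with the current residual (skipping chosen ones).
        best = max((j for j in range(len(cols)) if j not in support),
                   key=lambda j: abs(dot(cols[j], residual)) / norms[j])
        support.append(best)
        chosen = [cols[j] for j in support]
        # Least-squares fit on the chosen columns via the normal equations.
        gram = [[dot(u, v) for v in chosen] for u in chosen]
        coef = solve(gram, [dot(u, y) for u in chosen])
        residual = [yi - sum(c * u[i] for c, u in zip(coef, chosen))
                    for i, yi in enumerate(y)]
    s_hat = [0.0] * len(cols)
    for j, c in zip(support, coef):
        s_hat[j] = c
    return s_hat


random.seed(0)
N, M, K = 64, 24, 3                     # signal length, measurements, sparsity
psi = dct_basis(N)
s = [0.0] * N
for idx, val in zip(random.sample(range(N), K), (5.0, -3.0, 2.0)):
    s[idx] = val                        # K-sparse coefficient vector
x = matvec(psi, s)                      # signal, sparse in the DCT domain
# Phi is drawn once and never looks at x: the measurement is non-adaptive.
phi = [[random.gauss(0, 1) / math.sqrt(M) for _ in range(N)] for _ in range(M)]
theta = matmul(phi, psi)
y = matvec(phi, x)                      # only M < N measurements are acquired
s_hat = omp(theta, y, K)
err = max(abs(a - b) for a, b in zip(s, s_hat))
print(f"{M} measurements of a length-{N} signal, max coefficient error {err:.2e}")
```

With sizes like these the greedy recovery typically finds the sparse coefficients, but success is not guaranteed for every random Φ, and OMP is only one of several possible recovery algorithms.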


29
                    Other trends in video coding




30
                                      X-lets

•  Better exploit 2D (nD) singularities




31
                   ‘rugby’ 4.7 Mbit/s AVC/H.264




32
    ‘rugby’ 4.7 Mbit/s AVC/H.264 + texture synthesis




33
      Efforts in next generation image/video coding standardization

•  JPEG: Advanced Image Coding – AIC
•  MPEG: High performance Video Coding – HVC
•  VCEG: Next Generation Video Coding – NGVC




 Potential mergers/synergies in some of the above
 efforts are under discussion …



34
                                    Outline

•  First things first…
•  Trends in video coding
•  Challenges in video coding
•  Some last words…




35
                Content representation challenge

•  Which representations could potentially yield huge
   coding gains?
    –  Xlets
    –  Compressive sensing
    –  Texture analysis/synthesis
    –  …
•  Which video coding schemes perform best on
   new content types?
    –  Ultra High Definition
    –  3D
    –  HDR
    –  Holography
    –  …

36
                    Visual perception challenge

•  How to measure quality in 2D/3D video?
    –  Subjective quality assessment methodologies
    –  Objective quality metrics
    –  Quality of Experience
•  How to integrate more efficient perceptual
   coding tools into video coding?
    –  Perceptual focus of attention
    –  Perceptual masking
    –  Perceptual pre-/post- processing
    –  …
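
As a point of reference for the objective quality metrics mentioned above, PSNR is the most widely used baseline. The sketch below assumes 8-bit samples (peak value 255) and flat lists of pixel values; the function name `psnr` and the sample data are hypothetical. PSNR measures signal fidelity only and does not model perception, which is part of the challenge this slide raises.

```python
import math


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf                 # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 8-bit luma samples before and after lossy coding.
ref = [52, 55, 61, 66, 70, 61, 64, 73]
dist = [54, 55, 60, 66, 71, 61, 62, 73]

print(f"PSNR: {psnr(ref, dist):.2f} dB")
```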


37
                        Applications challenge

•  What are the killer applications that
   require alternative video compression
   methods with significant added value?
    –  P2P
    –  Low cost/power encoders
    –  Coding schemes reducing stream switching
       delay
    –  Coding schemes taking into account potential
       post-processing/interaction by users
    –  …
38
                             Other challenges

•  Video is often accompanied by audio:
    –  How to take advantage of AV correlation?
•  How to take better advantage of source/channel/network
   synergies and interactions?
•  How to take advantage of context in video
   coding?
•  How to take better advantage of computer
   vision, content annotation, search and retrieval
   in video compression?
•  …

39
                                    Outline

•  First things first…
•  Trends in video coding
•  Challenges in video coding
•  Some last words…




40
                                Some last words

•  "Prediction is very difficult, especially about the future"
   –  Niels Bohr (1885-1962), Nobel Prize in Physics 1922
•  Scientific and technological considerations are not the
  only factors which will decide the future of video coding
   –  Intellectual property complexities
   –  Policies
   –  Industrial/Economic interests
   –  …
•  Content is still king!




41



                             Thanks for your attention!
                             Questions, discussions, …




Acknowledgement goes to the many identified and unidentified individuals from whom some of the
materials presented here come …

