
What’s new in MPEG?

What's new in MPEG? A brief update about the results of its 131st MPEG meeting featuring:
- Welcome and Introduction: Jörn Ostermann, Acting Convenor of WG11 (MPEG)
- Versatile Video Coding (VVC): Jens-Rainer Ohm and Gary Sullivan, JVET Chairs
- MPEG 3D Audio: Schuyler Quackenbush, MPEG Audio Chair
- Video-based Point Cloud Compression (V-PCC): Marius Preda, MPEG 3DG Chair
- MPEG Immersive Video (MIV): Bart Kroon, MPEG Video BoG Chair
- Carriage of Versatile Video Coding (VVC) and Essential Video Coding (EVC): Young-Kwon Lim, MPEG Systems Chair
- MPEG Roadmap: Jörn Ostermann, Acting Convenor of WG11 (MPEG)

MPEG Web site: https://mpeg-standards.com/meetings/mpeg-131/


  1. 1. What’s new in MPEG? Webinar | July 21, 2020 | 10:00 UTC and 21:00 UTC
      • Topics: Versatile Video Coding, MPEG 3D Audio, Video-based Point Cloud Compression, MPEG Immersive Video, Carriage of VVC and EVC, MPEG Roadmap
      • Speakers: Jörn Ostermann (MPEG Convenor), Jens-Rainer Ohm (JVET), Gary Sullivan (JVET), Schuyler Quackenbush (MPEG Audio), Marius Preda (MPEG 3DG), Bart Kroon (MPEG Video), Young-Kwon Lim (MPEG Systems)
      • Further Information: https://bit.ly/mpeg131
      • MPEG Web Site: https://mpeg-standards.com/meetings/mpeg-131/
  2. 2. Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 Finalization of Versatile Video Coding Webinar, 21 July 2020 Gary Sullivan and Jens-Rainer Ohm JVET Co-chairs
  3. 3. Documents approved in recent meeting • Versatile Video Coding (JVET-S2001) – Twin text: ITU-T H.266 | ISO/IEC 23090-3 – Description of bitstream syntax and semantics, processes for core decoding and high-level syntax as necessary for decoding • Versatile SEI messages for coded video bitstreams (JVET-S2007) – Twin text: ITU-T H.274 | ISO/IEC 23002-7 – Independent SEI messages and VUI, specification not needed for core decoding process, could be used with VVC or other video standards • Test Model 10 of Versatile Video Coding (VTM 10) (JVET-S2002) – Encoder and algorithm description – Has corresponding software implementation • Draft 4 of VVC conformance testing (JVET-S2008) • VVC verification test plan (v3) (JVET-S2009)
  4. 4. Performance of VVC (PSNR)
      VTM9 compared to HEVC-HM under "common test conditions" (CTC). Random Access is most important in storage, streaming, broadcast.
      • UHD average >40% (PSNR) – both luma and chroma
      • Reasonable complexity tradeoff
      Random Access, over HM-16.20:
      Class      Y         U         V         EncT    DecT
      Class A1   −38.74%   −37.19%   −44.34%   884%    186%
      Class A2   −43.13%   −39.74%   −38.35%   999%    199%
      Class B    −34.74%   −46.77%   −44.61%   935%    189%
      Class C    −29.90%   −30.58%   −32.56%   1212%   199%
      Class E    (no values given)
      Overall    −35.93%   −39.13%   −40.09%   1004%   193%
      Class D    −27.64%   −26.48%   −26.11%   1326%   194%
      Class F    −41.55%   −44.78%   −46.09%   689%    163%
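The "Overall" row of the Random Access figures on this slide is consistent with a per-sequence average over Classes A1, A2, B and C. The sketch below checks that arithmetic. It assumes the sequence counts commonly used in the JVET common test conditions (3, 3, 5 and 4 sequences); those counts are not stated on the slide.

```python
# BD-rate changes (Y, U, V) in percent, copied from the Random Access figures.
bd_rate = {
    "A1": (-38.74, -37.19, -44.34),
    "A2": (-43.13, -39.74, -38.35),
    "B": (-34.74, -46.77, -44.61),
    "C": (-29.90, -30.58, -32.56),
}

# ASSUMPTION: number of test sequences per class (not given on the slide).
sequences = {"A1": 3, "A2": 3, "B": 5, "C": 4}


def overall(component: int) -> float:
    """Sequence-weighted mean of one colour component across the classes."""
    total = sum(bd_rate[c][component] * sequences[c] for c in bd_rate)
    return total / sum(sequences.values())


overall_yuv = [round(overall(i), 2) for i in range(3)]
print(overall_yuv)
```

Under that assumption the three values match the Overall row for Y, U and V. The encoder and decoder time columns are not reproduced here because the slide does not say how they are averaged.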
  5. 5. Visual Subjective Performance of VVC • Test with non-expert viewers, sequences not included in CTC (from preparation of verification test) • Notable: Visual results seem to be better for VVC than when measured by PSNR (from JVET-S0246)
  6. 6. Versatility of VVC Video Applications • Designed for a wide variety of types of video • Camera captured, computer-generated, and mixed content – Screen sharing – Adaptive streaming – Game streaming – Video with scrolling text, etc. • Standard and high dynamic range (emphasis on 10 bit video) • Various colour formats, including 4:4:4 and wide gamut • 360° video with various projection map types • Multiview video (including depth maps) • MPEG’s video-based point cloud compression • Lossless coding support
  7. 7. Special Features with High-Level Syntax • Flexible access mechanisms, including localized access using “subpictures” • Extraction and merging at bitstream level • Special boundary handling for gradual refresh and 360° video • Layered coding, including low-complex scalability operation • Nested temporal sublayering • Predictive reference picture resampling • Wavefront parallel processing similar to HEVC, with less CTU row delay • General constraints information: Mechanism to identify tool usage at high level
  8. 8. Overview of coding tools
      • Partitioning: Multi-type tree (Quad/binary/ternary)
      • Intra prediction using
        – more directional modes (incl. wide angles), DC and planar
        – sample smoothing with various adaptation methods (position dependent)
        – inheritance of chroma modes and chroma sample prediction from luma
        – multi-line prediction, matrix weighted prediction
      • Inter prediction using advanced MV coding, affine models, sub-block and geometric/diagonal partitioning, decoder side motion refinement (three tools named DMVR, BDOF, PROF)
      • Combined intra/inter prediction
      • Switchable primary and secondary transforms
      • New adaptive loop filter based on local classification, in-loop amplitude mapping stage, additional elements in deblocking
      • Quantization with log step size switching (& trellis-based dependent quantization)
      • Context-adaptive arithmetic coding with various improvements
      • Support for screen content (intra block copy, palette mode, transform skip) and lossless and near-lossless coding
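To make the multi-type tree item concrete, here is a minimal sketch of the block sizes produced by one quad, binary or ternary split. The function name and the mode strings are hypothetical, and the sketch ignores everything a real codec adds on top (minimum block sizes, picture boundaries, signalling). It only assumes the usual 1:2:1 proportions for a ternary split.

```python
def split_block(width: int, height: int, mode: str) -> list[tuple[int, int]]:
    """Return the (width, height) of each child block for a single split."""
    if mode == "quad":
        return [(width // 2, height // 2)] * 4
    if mode == "binary_vertical":
        return [(width // 2, height)] * 2
    if mode == "binary_horizontal":
        return [(width, height // 2)] * 2
    if mode == "ternary_vertical":
        # Quarter, half, quarter of the width.
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if mode == "ternary_horizontal":
        # Quarter, half, quarter of the height.
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(f"unknown split mode: {mode}")


children = split_block(32, 32, "ternary_vertical")
print(children)
```

Applying such splits recursively to the children is what turns the three split types into a tree.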
  9. 9. Obtain documents and software
      • Document archives (publicly accessible, >10k docs)
        – http://phenix.int-evry.fr/jvet
        – http://wftp3.itu.int/av-arch/jvet-site
        – http://phenix.int-evry.fr/jct
        – http://wftp3.itu.int/av-arch/jctvc-site
      • Software for VVC-VTM, HEVC-HM, and 360° Video (publicly accessible):
        – https://jvet.hhi.fraunhofer.de/
        – https://hevc.hhi.fraunhofer.de/
        – https://jvet.hhi.fraunhofer.de/svn/svn_360Lib/
  10. 10. ARL audio research labs MPEG-H 3D Audio Baseline Profile Schuyler Quackenbush MPEG Audio Chair
  11. 11. MPEG-H 3D Audio - Introduction
      • MPEG-H 3D Audio standard was finalized in 2015, specifying the Low Complexity Profile
      • The Low Complexity Profile enables delivery of:
        – Channels and Objects
        – Higher-Order Ambisonics (HOA)
      • Audio Objects are a key component in enabling advanced personalization options in broadcast applications
        – Dialog enhancement
        – Language selection
  12. 12. MPEG-H 3D Audio – New Profile • In July 2019, industry requested a new profile dedicated to broadcast, streaming and streaming immersive music applications. • In July 2020, WG11 (MPEG) announces the completion of Amendment 2 on 3D Audio which specifies the new Baseline Profile addressing this industry request.
  13. 13. MPEG-H 3D Audio Baseline Profile
      • Tailored for broadcast, streaming, and high-quality immersive music delivery, the Baseline profile:
        – Supports Channels and Objects.
        – Is a subset of the Low Complexity profile.
        – Supports all advanced broadcast and streaming features
      • [Figure labels. Baseline Profile: Advanced Coding (Channels and Objects), Loudness Control and DRC, Rendering and Downmix, Rich Metadata Set, Personalization and Interactivity, Accessibility and Dialog Enhancement, Seamless Configuration Changes, Sample Accurate Ad-insertion and Splicing, … Low Complexity Profile: HOA, LPD. Legend: DRC – Dynamic Range Control, LPD – Linear Prediction Domain]
  14. 14. MPEG-H 3D Audio Baseline Profile • In addition, the Baseline Profile: – Enables the use of up to 24 audio objects in Level 3 for high quality immersive music delivery. – Can be signaled in a backwards compatible fashion, such that Baseline Profile bitstreams will be decoded by all MPEG-H enabled devices that support either one of the two profiles
  15. 15. 3D Audio Baseline Profile Verification Test Report (Public Document)
      • Reports on the results of five subjective listening tests assessing the performance of the 3D Audio Baseline Profile.
      • Covers a wide range of bit rates and immersive audio use cases
      • The tests were conducted in nine different test sites:
        – Dolby, ETRI, Fraunhofer IIS, Gaudio, NHK, Nokia, Orange, Qualcomm and Sony
      • With a total of:
        – 341 listeners
        – 1,144,592 subjective scores
  16. 16. 3D Audio Baseline Profile Verification Test Report
      • Three Tests achieve "Excellent" quality on the MUSHRA scale:
        – Test 2: 11.1 or 7.1 channels at 512 kb/s to 256 kb/s rate
        – Test 3: 7.1, 5.1 and 2.0 channels at 256 kb/s to 48 kb/s rate
        – Test 4: Content as Test 2, but binauralized for headphones at 384 kb/s
      • Two Tests achieve "ITU-R High-Quality Emission" quality
        – Test 1 "Ultra-HD Broadcast": 22.2 channels at 768 kb/s
        – Test 5 "High-Quality Immersive Music Delivery": 24 audio objects coded at 1.5 Mb/s, presented as 11.1 (7.1 + 4H) loudspeakers
  17. 17. 360 Reality Audio Music Service
      • 360 Reality Audio music can be enjoyed by consumers using:
        – Tidal,
        – Deezer,
        – Nugs.net,
        – Amazon Music HD and
        – Sony Select (China).
      • https://www.sony.com/electronics/360-reality-audio
      • https://www.amazon.com/music/unlimited/why-hd?ref=dmm_LP_WHYHD
  18. 18. Point Cloud Compression in MPEG MPEG 131st Press Release, ISO/IEC FDIS 23090-5 Visual Volumetric Video-based Coding and Video-based Point Cloud Compression July 2020 Institut Polytechnique de Paris, FRANCE Marius PREDA MPEG 3D Graphics Chair
  19. 19. Point Cloud
      A set of 3D points
      • not ordered,
      • without relations between them
      Each point is defined by
      • (X, Y, Z)
      • (R, G, B) or (Y, U, V)
      • Reflectance, transparency, ...
  20. 20. Point Clouds
  21. 21. Sport viewing with point clouds: 360° background, 3D objects, 1-3 Gbps per object
  22. 22. Point Cloud 800,000 points -> 1 000 Mbps (uncompressed) Compression is required in order to make PC useful
  23. 23. Point Cloud Compression basic principles
      Very sparse occupancy of the 3D space
      - (usually) the objects are represented by their surface and not by volumes
      - In 2D a pixel has 8 neighbors, in 3D it has 26, and many of them are transparent
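The neighbour counts quoted on this slide follow from counting every offset in {-1, 0, 1} per axis except the all-zero offset, which gives 3^d - 1. A short sketch (the function name is just for illustration):

```python
from itertools import product


def neighbour_offsets(dims: int) -> list[tuple[int, ...]]:
    """All offsets with components in {-1, 0, 1}, excluding the cell itself."""
    return [offset for offset in product((-1, 0, 1), repeat=dims) if any(offset)]


print(len(neighbour_offsets(2)), len(neighbour_offsets(3)))
```

For a surface-like point cloud most of those 26 positions are unoccupied, which is the sparsity the slide refers to.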
  24. 24. Point Cloud Compression in MPEG
      • Timeline (2014 to 2020):
        – MPEG initiated the work on PCC
        – April 2017: MPEG issued a Call for Proposals
        – October 2017: 9 leading technology companies responded and MPEG evaluated their proposals
        – October 2018: first Committee Draft issued
        – V-PCC (Video-based PCC): 07/2020
        – G-PCC (Geometry-based PCC): 10/2020
      • [Figure: V-PCC encoder block diagram. Blocks: patch generation, packing, geometry image generation, texture image generation, occupancy map compression, auxiliary patch-info compression, image padding, video compression, smoothing, multiplexer. Signals: input point cloud frame, patch info, occupancy map, geometry images, texture images, padded geometry images, padded texture images, reconstructed geometry images, smoothed geometry, compressed geometry video, compressed texture video, compressed occupancy map, compressed auxiliary patch information, compressed bitstream]
  25. 25. Video-based Point Cloud Compression Main ideas: (1) a point coordinate is encoded as a distance with respect to a particular plane, inspired by displacement mapping in graphics [Figure labels: pixel intensity, vertex height]
  26. 26. Video-based Point Cloud Compression Main ideas: (2) the color (or any attribute) associated with a 3D vertex is encoded in a 2D texture, inspired by texture mapping in graphics [Figure labels: vertex color, pixel color]
  27. 27. Video-based Point Cloud Compression Projecting all the points on a single plane would result in several 3D points having the same 2D projection -> several depth values should be stored per pixel
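The collision problem described on this slide can be illustrated with a toy projection along the z axis. This is only a sketch: the function name is hypothetical, coordinates are assumed to be small non-negative integers, and keeping the nearest point per pixel is an illustrative choice, not a description of how V-PCC handles multiple depth values.

```python
def project_to_depth_map(points, width, height):
    """Project integer (x, y, z) points along z onto a width x height image.

    Returns the depth image, the occupancy map, and the number of points
    that landed on an already occupied pixel.
    """
    depth = [[None] * width for _ in range(height)]
    collisions = 0
    for x, y, z in points:
        if depth[y][x] is None:
            depth[y][x] = z
        else:
            collisions += 1
            # Illustrative choice: keep the point nearest to the plane.
            depth[y][x] = min(depth[y][x], z)
    occupancy = [[d is not None for d in row] for row in depth]
    return depth, occupancy, collisions


points = [(0, 0, 5), (1, 0, 2), (1, 0, 9), (2, 1, 4)]
depth, occupancy, collisions = project_to_depth_map(points, 3, 2)
print(depth, collisions)
```

Here the points (1, 0, 2) and (1, 0, 9) share a pixel, so a single depth value per pixel loses one of them. The occupancy map records which pixels carry a valid depth at all.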
  28. 28. Video-based Point Cloud Compression Projecting per patch is preferred: - A set of points (patch) in a small neighborhood is projected on the same plane - The set of projection planes is very limited - 6 faces of the cube - 4 additional diagonal planes
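One plausible way to assign a point to a projection plane is to compare its surface normal with each plane's normal and take the best match. The sketch below does this for the 6 cube faces only; the diagonal planes are left out, the normals are assumed to be estimated elsewhere, and the names are hypothetical rather than taken from the standard or its test model.

```python
# Outward normals of the 6 cube faces used as candidate projection planes.
AXIS_PLANES = {
    "+x": (1, 0, 0),
    "-x": (-1, 0, 0),
    "+y": (0, 1, 0),
    "-y": (0, -1, 0),
    "+z": (0, 0, 1),
    "-z": (0, 0, -1),
}


def best_plane(normal):
    """Return the name of the plane whose normal best matches `normal`."""
    def alignment(name):
        return sum(n * a for n, a in zip(normal, AXIS_PLANES[name]))

    return max(AXIS_PLANES, key=alignment)


print(best_plane((0.1, 0.9, -0.2)))
```

Neighbouring points that pick the same plane can then be grouped into a patch and projected together.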
  29. 29. Video-based Point Cloud Compression Encoding the 3D point clouds as a set of 2D patches Geometry Color (Attributes)
  30. 30. Video-based Point Cloud Compression Encoding the 3D point clouds as a set of 2D patches - To enforce lossless coding, the missed points are encoded separately
  31. 31. Video-based Point Cloud Compression Encoding the 3D point clouds as a set of 2D videos: depth, color and occupancy maps MPEG is very good in video coding! Problem solved
  32. 32. Video-based Point Cloud Compression Encoding 3D point clouds as a set of 2D videos: color, depth and occupancy map 100,000 points @ 30fps 360 Mbps (uncompressed) 1 Mbps (MPEG PCC 2020) 7 Mbps 4.4 Mbps
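The figures on this slide imply a raw cost per point and a compression ratio, which the following sketch works out. It uses only the numbers quoted above (100,000 points, 30 fps, 360 Mbps uncompressed, 1 Mbps for MPEG PCC 2020). How the raw bits split between geometry and attributes is not stated on the slide, so it is not modelled.

```python
def raw_bits_per_point(mbps: float, points: int, fps: float) -> float:
    """Bits spent per point per frame at the given bitrate."""
    return mbps * 1e6 / (points * fps)


uncompressed_mbps = 360.0
compressed_mbps = 1.0

bits_per_point = raw_bits_per_point(uncompressed_mbps, 100_000, 30)
ratio = uncompressed_mbps / compressed_mbps
print(bits_per_point, ratio)
```

The same function can be applied to the 7 Mbps and 4.4 Mbps figures once it is known which content or operating point they refer to.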
  33. 33. Video-based Point Cloud Compression V-PCC implementations publicly available Integrated real-time decoder and renderer source code is also available for Android, Windows & Linux www.mpeg-pcc.org
  34. 34. Video-based Point Cloud Compression Beyond V-PCC ISO/IEC FDIS 23090-5 Visual Volumetric Video-based Coding and Video- based Point Cloud Compression Visual Volumetric Video-based Coding is an MPEG framework for 3D to 2D projection based coding technologies - used by V-PCC - used by MIV (MPEG Immersive Video) - to be used for future projects (Dynamic Mesh Coding)
  35. 35. 30 organizations 90 authors MPEG PCC contributors
  36. 36. MPEG Immersive Video (MIV) ISO/IEC 23090-12 Bart Kroon bart.kroon@philips.com
  37. 37. MPEG Immersive Video (MIV) ISO/IEC 23090-12
      • Schedule for MIV:
        • MPEG 131 – July 2020 – CD
        • MPEG 133 – Jan. 2021 – DIS
        • MPEG 135 – July 2021 – FDIS
      • Schedule for V3C and V-PCC (2nd ed.):
        • MPEG 132 – Oct. 2020 – CD
        • MPEG 133 – Jan. 2021 – DIS
        • MPEG 135 – July 2021 – FDIS
      • The MIV committee draft references FDIS 23090-5 (1st ed.)
      • [Figure labels: Visual Volumetric Video-based Coding (V3C), Video-based Point Cloud Compression (V-PCC), MPEG Immersive Video (MIV)]
  38. 38. Example encoder source 15 views (photorealistic): • 4K × 2K • 360° equirectangular projection (ERP) • Geometry (=depth range) • Texture attribute (YCbCr)
  39. 39. Example encoder source 16 physical cameras: • 2K × 1K • Perspective projection • Geometry (=depth range) • Texture attribute (YCbCr) 210 mm 210 mm
  40. 40. MIV codec model Multi-view video • Geometry (G) • Texture attribute (T) • View parameters MIV encoder MIV decoder/ renderer V3C bitstream Reconstruction Viewing space Viewport video • (Geometry) • Texture attribute Original T G Atlas Complete (basic) view Patches from additional views
  41. 41. Bitstream structure V3C unit stream V3C parameter set V3C unit V3C unit V3C unit V3C unit Sub bitstream V3C unit Access unit Access unit Access unit… V3C unit Access unit Access unit Access unit… Sub bitstreams: • Common atlas data (has view parameters) • Multiple atlases: • Geometry video data • Attribute video data • Atlas data (has patch parameters)
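The structure on this slide (one unit stream carrying a parameter set plus units that belong to different sub bitstreams) can be pictured as a demultiplexing step. The sketch below is a data-structure illustration only: the class names, the fields and the grouping key are assumptions made for this example and do not reproduce the V3C syntax.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UnitType(Enum):
    # Labels taken from the slide; these are not the standard's type codes.
    PARAMETER_SET = "V3C parameter set"
    COMMON_ATLAS_DATA = "Common atlas data"
    ATLAS_DATA = "Atlas data"
    GEOMETRY_VIDEO_DATA = "Geometry video data"
    ATTRIBUTE_VIDEO_DATA = "Attribute video data"


@dataclass(frozen=True)
class V3CUnit:
    unit_type: UnitType
    atlas_id: Optional[int]  # None for units not tied to a single atlas
    payload: bytes


def demultiplex(units):
    """Group unit payloads into sub bitstreams keyed by (type, atlas id)."""
    sub_bitstreams = defaultdict(list)
    for unit in units:
        sub_bitstreams[(unit.unit_type, unit.atlas_id)].append(unit.payload)
    return dict(sub_bitstreams)


stream = [
    V3CUnit(UnitType.PARAMETER_SET, None, b"vps"),
    V3CUnit(UnitType.COMMON_ATLAS_DATA, None, b"views"),
    V3CUnit(UnitType.ATLAS_DATA, 0, b"patches-0"),
    V3CUnit(UnitType.GEOMETRY_VIDEO_DATA, 0, b"geo-0a"),
    V3CUnit(UnitType.GEOMETRY_VIDEO_DATA, 0, b"geo-0b"),
    V3CUnit(UnitType.ATTRIBUTE_VIDEO_DATA, 1, b"tex-1"),
]
subs = demultiplex(stream)
print(sorted(key[0].name for key in subs))
```

Each video sub bitstream obtained this way would then go to an ordinary video decoder, while the atlas data tells the renderer where the patches sit.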
  42. 42. Test model – Encoder (Simplified)
      [Figure: encoder block diagram. Inputs: source views (geometry video data, attribute video data), view parameters (camera data, format), parameters. Processing blocks: automatic parameter selection (geometry quality, basic/additional views, atlas frame sizes); prune views (flag redundant pixels); pack patches into atlases; encode video sub bitstreams (HEVC); multiplex. Other labels: geometry video data (raw), attribute video data (raw), atlas data, parameter set, view parameters list, SEI messages. Output: bitstream (one file; V3C sample stream with MIV extensions)]
  43. 43. Test model – Decoder/renderer Filter out blocks Color code Core processes Filter viewport Decoded access unit (all conformance points) Patch culling Pruned view reconstruction View synthesis Inpainting Viewing space handling Viewport Viewport parameters Geometry upscaling (Simplified)
  44. 44. Discussion • Flexible standard for multiview video with depth: • Video codec agnostic (e.g. HEVC, VVC, …) • MIV Main uses a subset of V3C • Extensible with more V3C features (multiple attributes, occlusion video data, SEI messages, etc.) • MIV-specific extensions (coding per group of views, auxiliary patches, object-based coding, etc.) • Please participate: • Test model: https://gitlab.com/mpeg-i-visual/tmiv • Test material (14 sequences) available on request
  45. 45. Carriage of VVC and EVC in MPEG Systems Young-Kwon Lim Chair of MPEG Systems young.L@Samsung.com
  46. 46. What is carriage? [Figure: a Video Coding Standard linked to the systems standards that carry it]
      • ISO/IEC 13818-1 MPEG-2 Systems: delivery over MPEG-2 TS
      • ISO/IEC 14496-15 NAL File Format: storage and delivery over ISOBMFF
      • ISO/IEC 14496-12 ISO Base Media File Format
      • ISO/IEC 23008-12 Image file format: storage and delivery over ISOBMFF as an image or image sequence
      • ISO/IEC 23000-19 Common Media Application Format: brands definition for CMAF Segments with a specific video codec
      • ISO/IEC 23009-1 Media Presentation Description and Segment Formats: MPEG-DASH extension for a specific video codec
      • ISO/IEC 23000-22 Multi Image Application Format: brands definition for Image File Format with a specific video codec
      • ISO/IEC 23008-1 MPEG Media Transport: delivery over MMT
  47. 47. 13818-1 MPEG-2 Systems
      • ISO/IEC 13818-1:2019 AMD 2 Carriage of VVC in MPEG-2 TS
        • Current Stage: DAM
        • ETA for Final Stage: 2021/04
        • Features
          • VVC data alignment with PES packets
          • VVC video descriptor and VVC HRD descriptor
          • Constraints on transport of VVC bitstream
          • T-STD extension for single layer VVC and layered temporal video subsets
      • ISO/IEC 13818-1:2019 AMD 3 Carriage of EVC in MPEG-2 TS and update of the MPEG-H 3D Audio descriptor
        • Current Stage: CDAM
        • ETA for Final Stage: 2021/07
        • Features
          • EVC data alignment with PES packets
          • Descriptors carrying metadata for EVC elementary streams
          • Constraints for the transport of EVC elementary streams
          • The T-STD buffer model for EVC elementary streams
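As background for the PES alignment items on this slide, here is a sketch of reading the fixed 4-byte header of an MPEG-2 transport stream packet. It covers only the generic header fields (sync byte, payload unit start indicator, PID, adaptation field control, continuity counter). The VVC and EVC descriptors and constraints defined by the amendments are not modelled, and the function name is hypothetical.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Decode the generic 4-byte header of one 188-byte TS packet."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("expected a 188-byte packet starting with sync byte 0x47")
    return {
        # Set when a new PES packet (or PSI section) starts in this payload.
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit packet identifier: low 5 bits of byte 1, all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


# Hand-built example packet: PID 0x100, payload only, counter 5.
packet = bytes([0x47, 0x41, 0x00, 0x15]) + bytes(184)
header = parse_ts_header(packet)
print(header)
```

A demultiplexer would use the PID to select the video elementary stream and the payload unit start flag to find where each PES packet begins.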
  48. 48. 14496-15 NAL File Format
      • ISO/IEC 14496-15:2019 AMD 2 Carriage of VVC and EVC in ISOBMFF
      • Current Stage: DAM
      • ETA for Final Stage: 2021/04
      • VVC related features
        • definition of sample, sub-sample, sync sample, decoder configuration record, etc.
        • storage format for single-layer VVC (ISO/IEC 23090-3) video streams
        • storage of multiple layers in one track or each layer/sub-layer in its own track
        • storage format for VVC bitstreams with more than one layer
      • EVC related features
        • definition of sample, sub-sample, sync sample, decoder configuration record, etc.
        • storage format for single-layer EVC video streams
  49. 49. 23008-12 Image file format
      • ISO/IEC 23008-12:2017 AMD 3 Support for VVC, EVC, slideshows and other improvements
      • Current Stage: CDAM
      • ETA for Final Stage: 2021/07
      • VVC related features
        • definition of image item, sub-sample item, VVC operating points information property, subpicture items, etc.
        • definition of VVC image sequences
        • definition of VVC-specific brands: vvic for image and image collections, and vvcc for image sequence
      • EVC related features
        • definition of image item, sub-sample item, etc.
        • definition of EVC-specific brands
          • evbi and evbs for EVC baseline profile image and image sequence, respectively
          • evmi and evms for EVC main profile image and image sequence, respectively
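Brands such as those listed on this slide are signalled in the `ftyp` box at the start of an ISOBMFF file. The sketch below reads that box and looks the brands up in a table built from the slide. The brand names come from a draft amendment and may have changed before publication. The sample bytes are constructed by hand for this example, and the reader ignores the 64-bit and to-end-of-file size conventions.

```python
import struct

# Brand descriptions as given on the slide (draft stage, subject to change).
BRANDS = {
    "vvic": "VVC image and image collections",
    "vvcc": "VVC image sequence",
    "evbi": "EVC baseline profile image",
    "evbs": "EVC baseline profile image sequence",
    "evmi": "EVC main profile image",
    "evms": "EVC main profile image sequence",
}


def read_ftyp_brands(data: bytes) -> list[str]:
    """Return the major brand followed by the compatible brands."""
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("first box is not ftyp")
    major = data[8:12].decode("ascii")
    # Bytes 12..15 hold minor_version; compatible brands follow in 4-byte steps.
    compatible = [data[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return [major] + compatible


# Hand-made ftyp box: major brand mif1, compatible brands vvic and mif1.
sample = struct.pack(">I4s4sI4s4s", 24, b"ftyp", b"mif1", 0, b"vvic", b"mif1")
brands = read_ftyp_brands(sample)
described = [BRANDS[b] for b in brands if b in BRANDS]
print(brands, described)
```

A player could use such a lookup to decide early whether it has a decoder for the file's content.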
  50. 50. 23008-1 MPEG Media Transport
      • ISO/IEC 23008-1 3rd edition AMD 2 Carriage of EVC in MMT
      • Current Stage: CDAM
      • ETA for Final Stage: 2021/07
      • Features
        • Use of CMAF track constraints for MPU
  51. 51. Thank you! Questions?
  52. 52. MPEG Roadmap [Figure: timeline from Jan 2019 to Jan 2025 covering MPEG meetings 125 to 149]
      • Category labels: Media Coding, Systems and Tools, Beyond Media
      • Projects shown: Versatile Video Coding, VVC Extensions, (Machine Learning), Essential Video Coding (MPEG-5), Low Complexity Enhancement Video, MPEG Immersive Video (MIV), MIV v.2, 6 DoF Audio, Geometry Point Cloud Compression (G-PCC), Geometry PCC v.2, Visual Volumetric Video-Based Coding (V3C), Dynamic Mesh Compression, Video Coding for Machines, VCM v.2, Neural Network Compression for Multimedia, NNC for Multimedia v.2, Scene Description, Scene Description v.2, PCC Systems Support, CMAF v.2, OMAF v.2, OMAF v.3, Partial File Format v.2, Colour Support in Open Font Format, Video Decoding Interface, Network-Based Media Processing, NBMP v.2, Genome Compression, Genome Compression v.2, Genome Annotation Compression
