
Bring the Future of Entertainment to Your Living Room: MPEG-I Immersive Video Engines | SIGGRAPH 2019 Technical Sessions

Explore the proposed Metadata for Immersive Video (MIV) standard specification. MIV enables real-world content captured by cameras to be viewed by users with Six Degrees of Freedom (6DoF) movement, similar to a VR experience with synthetic content.



  1. SIGGRAPH 2019 | LOS ANGELES | 28 JULY - 1 AUGUST | Bringing the Future of Entertainment to Your Living Room: MPEG-I Immersive Video | Jill Boyce
  2. Immersive Media
     • With the recent resurgence in VR technologies, there has been rekindled interest in creating VR-like experiences of real events
     • Real camera-captured content vs. computer-generated synthetic content
  3. Bringing Immersive Media to You
     • Immersive video requires dramatically more data than traditional video
     • Efficient compression of immersive media is a vital ingredient, so that it can be distributed over networks
     • How to distribute immersive media content to the devices in your living room or beyond?
  4. MPEG Codec Standards Aim to Compress and Distribute Media
     • End-to-end pipeline: Capture → Encode → Network → Decode → Display
     • Capture and display are required in the end-to-end system, but are generally out of scope of MPEG standards
     • The scope of MPEG codec standards covers only the bitstream format
  5. MPEG Beyond 2D Video
     PAST
     • Multi-view video (including stereo): AVC (2009), HEVC (2015)
     • 360 video (with 3 DoF): AVC (2019), HEVC (2018)
     • Omnidirectional Media Format (OMAF) version 1 (2019)
     FUTURE
     • Point Cloud Coding – Video (V-PCC): expected early 2020
     • Point Cloud Coding – Graphics (G-PCC): expected mid 2020
     • Immersive Video (MIV), formerly called 3DoF+: expected mid 2020
     • OMAF version 2
  6. MPEG-I: “I” Is for Immersive
     1. Architectures for Immersive Media (Technical Report)
     2. Omnidirectional Media Format (OMAF)
     3. Versatile Video Coding
     4. Immersive Audio
     5. Point Cloud Coding – Video
     6. Immersive Media Metrics
     7. Immersive Media Metadata
     8. Network-Based Media Processing
     9. Point Cloud Coding – Graphics
     10. Carriage of Point Cloud Data
     11. Implementation Guidelines for Network-Based Media Processing
     12. Immersive Video
  7. 360 Video = 3 Degrees of Freedom
     • 360° video represents a sphere (360°×180°) or a portion of a sphere
       § Captured by cameras with multiple lenses, capturing a wider field of view than is viewed at any particular time
       § May be stereoscopic or monoscopic
     • Viewer selects the viewport based on the orientation of the HMD (or mouse input)
       § Viewer is at a fixed position in the center of the sphere, looking out
       § 3 Degrees of Freedom (3 DoF): yaw, pitch, roll
       § The field of view of the viewport represents only a small portion of the full sphere
     • No motion parallax
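The 3DoF viewport selection above amounts to turning the HMD's yaw and pitch into a viewing direction and sampling the corresponding region of the equirectangular (ERP) frame; roll only spins the image about that direction. A minimal sketch of the mapping, assuming a y-up convention with yaw about the vertical axis (the function names and conventions are illustrative, not from the MIV specification):

```python
import math

def yaw_pitch_to_direction(yaw, pitch):
    """Unit viewing direction for a 3DoF viewer (angles in radians).

    Illustrative convention: yaw rotates about the vertical (y) axis,
    pitch tilts up/down; roll spins the image about this direction
    and does not change it.
    """
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    return (x, y, z)

def direction_to_erp_pixel(d, width, height):
    """Map a unit direction to pixel coordinates in a width x height ERP image."""
    x, y, z = d
    lon = math.atan2(x, z)                    # longitude in [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, y)))   # latitude in [-pi/2, pi/2]
    u = (lon / (2 * math.pi) + 0.5) * (width - 1)
    v = (0.5 - lat / math.pi) * (height - 1)
    return (u, v)
```

For example, a viewer looking straight ahead (yaw = pitch = 0) samples the center of the ERP frame; rendering a viewport then repeats this mapping per pixel of the viewport's field of view.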
  8. MPEG: Stages of Immersion
     • 3DoF (360° video): viewer can change orientation (yaw, pitch, roll) but not position
     • 3DoF+: viewer can change orientation (yaw, pitch, roll) and make small (head-scale) changes to (x, y, z) position
     • Windowed 6DoF: viewer can change orientation (yaw, pitch, roll) and (x, y, z) position, but within a constrained view area
     • 6DoF: viewer can change orientation (yaw, pitch, roll) and (x, y, z) position
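The stages above can be read as progressively looser constraints on a viewer pose. A toy sketch of that idea, where the `Pose` class, stage labels, and the head-scale bound of 0.3 m are illustrative assumptions, not values from the standard:

```python
from dataclasses import dataclass

@dataclass
class Pose:
    yaw: float    # orientation, radians
    pitch: float
    roll: float
    x: float      # position, metres
    y: float
    z: float

def constrain_pose(pose, stage, head_range=0.3):
    """Clamp a requested pose to what each stage of immersion allows.

    `head_range` (metres) for 3DoF+ is an illustrative value, not
    taken from the standard.
    """
    p = Pose(pose.yaw, pose.pitch, pose.roll, pose.x, pose.y, pose.z)
    if stage == "3dof":
        p.x = p.y = p.z = 0.0                # orientation only
    elif stage == "3dof+":
        clamp = lambda v: max(-head_range, min(head_range, v))
        p.x, p.y, p.z = clamp(p.x), clamp(p.y), clamp(p.z)
    # "windowed-6dof" and "6dof" allow free translation; windowed 6DoF
    # instead restricts the renderable view area (not modelled here).
    return p
```

Orientation is never restricted in any stage; only translation is progressively unlocked.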
  9. Immersive Video vs. Virtual Reality
     • VR games let you experience a virtual world with 6 degrees of freedom; the virtual world is represented with a 3D model, e.g. a mesh
     • Immersive video lets you remotely experience a real, camera-captured 3D scene with 6 degrees of freedom
     • Can be consumed on a variety of devices:
       § VR headset
       § Lightfield display
       § Mono or stereo 2D screen with view position/orientation selection or detection
     • The range of viewer motion is limited to the range captured by the cameras
     • Immersive video can also be used to represent synthetic content from 3D models rendered remotely with extremely high quality/complexity
  10. Capturing Immersive Media Content: Inside-Out vs. Outside-In
     • 360° video: multiple lenses inside-out, stitched together
     • 3 DoF+: multiple nearby lenses
     • 6 DoF: outside-in capture
     (Example systems pictured: Intel Sports FreeD, Lytro, 8i)
  11. Metadata for Immersive Video (MIV)
     • Codec input is texture + depth at multiple camera positions
     • Cameras may be omnidirectional or perspective
     • Enables 6DoF viewer playback within the range of the camera-captured volume
     • Utilizes existing HEVC codecs
     • Call for Proposals responses reviewed at the March 2019 meeting
     • First Working Draft of the standard specification: April 2019
     • Aiming for technical completion mid-2020
     (Image: Technicolor Museum sequence)
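Because the codec input is texture plus depth per camera, 6DoF playback hinges on unprojecting depth-map pixels to 3D points and reprojecting them into the viewer's target view. A minimal sketch for a perspective (pinhole) camera, assuming conventional intrinsics `fx, fy, cx, cy`; the function names are illustrative and this ignores the omnidirectional (ERP) camera case and occlusion handling:

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole unprojection: pixel (u, v) plus depth -> 3D camera-space point."""
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return (X, Y, depth)

def project(X, Y, Z, fx, fy, cx, cy):
    """Pinhole projection: 3D camera-space point -> pixel (u, v)."""
    return (fx * X / Z + cx, fy * Y / Z + cy)
```

A view synthesizer unprojects every source pixel, transforms the 3D point into the target camera's coordinate frame, and projects it again; the two functions above round-trip exactly when source and target cameras coincide.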
  12. MPEG-I 3DoF+ Test Sequences
     • Technicolor Painter (m40010 and m40011): 90 frames (30 fps), 16 source views (4×4 camera array, baseline ~68 mm), 2048 × 1088
     • Intel Frog (m43748 and m44914): 300 frames (30 fps), 15×1 camera array (baseline 36.75 mm), 1920 × 1080
     (Image shows views v0, v7, v14)
  13. MPEG-I 3DoF+ Test Sequences
     • Technicolor Museum (m42349): 300 frames (30 fps), 24 source views at 2048 × 2048, view FoV 180° × 180° ERP, global FoV 360° × 180°
     • Technicolor Hijack (m42349): 300 frames (30 fps), 10 source views at 4096 × 4096, view FoV 180° × 180° ERP, global FoV 180° × 180°
     • ClassroomVideo (m42415 + m42756): 120 frames (30 fps), 15 source views at 4096 × 2048, view FoV 360° × 180° ERP, global FoV 360° × 180°
  14. TMIV System
     • TMIV encoder: source view pairs (texture + depth) feed an atlas patch occupancy map generator and a metadata encoder; an HEVC encoder pair (texture + depth) plus coded camera-parameters metadata produce the bitstream
     • MIV decoder: an HEVC decoder pair (texture + depth) and a metadata parser feed the reference renderer, which uses the viewing position & orientation to produce the viewport
     • The red box in the diagram shows the standardization scope
  15. Encoder – Atlas Constructor
     • View representations (texture + depth for View0, View1, View2) undergo pruning & patch selection
     • Selected patches (e.g. patches 2, 3, 5, 7, 8) are packed into texture and depth atlases (Texture/Depth #0 and #1)
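The atlas constructor above prunes redundant view regions and packs the surviving rectangular patches into atlas frames. A toy shelf-packing sketch of the packing step, assuming axis-aligned rectangular patches (TMIV's actual packer is more sophisticated, and the function name is illustrative):

```python
def pack_patches(patches, atlas_w, atlas_h):
    """Pack (w, h) patches into one atlas with simple shelf packing.

    Returns {patch_index: (x, y)} top-left placements, or raises if a
    patch does not fit. Sorting by height first keeps shelves compact.
    """
    order = sorted(range(len(patches)), key=lambda i: -patches[i][1])
    placements = {}
    shelf_x, shelf_y, shelf_h = 0, 0, 0
    for i in order:
        w, h = patches[i]
        if shelf_x + w > atlas_w:            # row full: start a new shelf
            shelf_y += shelf_h
            shelf_x, shelf_h = 0, 0
        if shelf_y + h > atlas_h or w > atlas_w:
            raise ValueError("patch %d does not fit in the atlas" % i)
        placements[i] = (shelf_x, shelf_y)
        shelf_x += w
        shelf_h = max(shelf_h, h)
    return placements
```

The decoder side inverts this: per-patch metadata (position in the atlas, source view, position in that view) lets the renderer copy each patch back into its view representation before synthesis.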
  16. Example MIV Pose Trace
     • Atlases of patches for compression
  17. Intel Hardware Implementation
     • Intel media: low-power, high-performance, dedicated fixed-function HEVC video decoder hardware
     • Intel graphics: proprietary view synthesis algorithms for improved video quality and high performance
  18. DEMO
