One of the most exciting promises of Virtual Reality is to take people to places and events they can't otherwise go. Whether it's wandering among the ruins at Machu Picchu, being at the recording of a top TV show, at a gig by your favourite artist, in front-row seats at the FIFA final or standing on the red carpet at the Oscars, VR has the potential to make you feel like you're truly there, possibly in real time as the events unfold.
Delivering on this promise with current technology introduces a number of challenges: the resolution of cameras and headsets, the bandwidth available to consumers, and how to generate a real sense of presence.
This talk will provide an overview of the current live VR video pipeline, then dig into technical detail on two or three key areas:
* Live Stream Compression (codecs, stereo, projections and packing schemes)
* Dealing with bandwidth challenges for live stream upload and consumer download
* Future opportunities from stereo depth estimation and freeing head position
6. What is Presence?
Wikipedia: defined as a person's subjective sensation of being there in a scene depicted by a medium
Michael Abrash: “Presence is VR Magic… it engages you at a deeper, more visceral level than any other form of entertainment”
7. Presence Requirements
Feature                  | VR Today   | Human Perception
Field of View (per eye)  | ~80° x 90° | 160° x 130°
Acuity (pixels / degree) | 12 - 18    | ~60 (and True HD)
Resolution (per eye)     | ~1k x 1k   | ~10k x 8k
Refresh Rate             | 90 Hz      | 120 Hz ?
Tracking / latency       | 5 - 20 ms  | 4 ms ?
Michael Abrash at Steam Dev Days 2014
http://media.steampowered.com/apps/abrashblog/Abrash%20Dev%20Days%202014.pdf
8. Video Mechanics - Capture
Samsung Beyond, iZugur Z63DC
Google Jump / GoPro Odyssey
10. Stitching / Projection
Stitch images together
To map onto a sphere surrounding viewer
Just like map projection in geography
Most common is equirectangular projection
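The mapping the slide describes can be sketched directly: a unit view direction converts to longitude and latitude, which index into the equirectangular frame. This is a minimal sketch; the axis convention (y up, -z straight ahead) is an assumption, not something the slides specify.

```python
import math

def equirect_uv(x, y, z):
    """Map a unit view direction (x, y, z) to equirectangular
    texture coordinates (u, v) in [0, 1].
    Assumes y is up and (0, 0, -1) is the image centre."""
    lon = math.atan2(x, -z)                   # longitude in [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, y)))   # latitude in [-pi/2, pi/2]
    u = 0.5 + lon / (2 * math.pi)
    v = 0.5 - lat / math.pi
    return u, v

# Looking straight ahead lands in the middle of the frame
print(equirect_uv(0.0, 0.0, -1.0))  # (0.5, 0.5)
```

Rendering in the headset is just the inverse: for each screen pixel, compute its view direction and sample the frame at (u, v).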
16. Market changing fast
Capture
Huge variety of cameras
No camera meets all needs
Next VR, Nokia, Samsung, GoPro, Ricoh, Kodak, Sphericam, Vuze
Stitching and Projection
Some cameras have it built in
VideoStitch have Vahana
Broadcast
YouTube, Facebook and many video streaming companies
17. Videos Today
Max resolution 4k x 2k video
Mix of mono and stereo
Almost all using equirectangular
18. Challenges
Many choices
Capture quality
Dynamic range
Resolution / Bandwidth
Head Movement
Stereo Quality
…
20. Resolution / Bandwidth
4k video is normally 3840 x 2160 x 8 bit
H.264 good quality 18 mbps
Bandwidth for 1 hour of video at 18 mbps
60 * 60 * 18 / 8 = 8 gigabytes
For 100,000 viewers
8 GB * 100000 = 800 terabytes
Bandwidth might be 5p per GB
Cost = 0.05 * 800,000 = £40,000
20x cost of equivalent SD broadcast (4x 1080p)
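The slide's arithmetic can be checked with a quick script. The bitrate, viewer count and 5p/GB price are the slide's own assumptions; the slide rounds the results to 8 GB, 800 TB and £40,000.

```python
# Back-of-envelope cost of streaming 1 hour of 18 Mbps video.
bitrate_mbps = 18
seconds = 60 * 60
gb_per_viewer = bitrate_mbps * seconds / 8 / 1000  # megabits -> gigabytes

viewers = 100_000
price_per_gb = 0.05  # 5p

total_gb = gb_per_viewer * viewers
cost = total_gb * price_per_gb
print(gb_per_viewer, total_gb / 1000, cost)
# ~8.1 GB per viewer, ~810 TB total, ~£40,500
```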
21. Target is Headset Resolution
Gear VR has highest pixel density
H.FoV = 72.9° & H.Res = 1280
~17.5 pixels per degree
Target resolution ~6.3k x 3.2k per eye
Many H.264 codecs won’t handle this
4K video on Gear VR gives
~10.5 pixels per degree horizontally
~5.4 vertically
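The Gear VR target reduces to a short calculation. The 72.9° horizontal FoV and 1280-pixel figures are taken from the slide; this is just a sketch of the arithmetic, not a measurement.

```python
# Pixels per degree on Gear VR, and the 360 x 180 video resolution
# needed to match it.
h_fov_deg = 72.9
h_res = 1280
ppd = h_res / h_fov_deg       # ~17.6 pixels per degree

target_w = 360 * ppd          # -> "~6.3k"
target_h = 180 * ppd          # -> "~3.2k"

# What a 3840-wide 360° video actually delivers horizontally
ppd_4k = 3840 / 360
print(round(ppd, 1), round(target_w), round(target_h), round(ppd_4k, 1))
# 17.6 6321 3160 10.7
```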
23. Technical Challenge #1
Bandwidth / Resolution
Native headset resolution video in Stereo
Equivalent quality to 18 Mbps 4K video
But at much lower bandwidth – ideally 3-4 Mbps
24. Look for Redundancy
Native resolution 17.5 pixels per degree
Equirectangular texture = 6.3k x 3.2k x 2
Notice how stretched it is at poles
25. Look for redundancy – Projection
Native resolution for Gear VR
6k x 3k x 2 => ~40 Megapixels for stereo pair
Actual pixels needed is much less
Surface of a sphere with circumference 6k (area = circumference² / π)
~24.5 Megapixels (equirectangular wastes ~60% extra pixels)
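The redundancy claim follows from geometry: a sphere of circumference C has surface area C²/π, while the 2:1 equirectangular texture covers C²/2, so equirectangular carries π/2 ≈ 1.57x the ideal pixel count. A quick check (the slide rounds this ~57% overhead up to ~60%):

```python
import math

width = 6300                       # pixels round the equator (~6.3k)
height = width // 2                # 2:1 equirectangular
equirect_px = width * height * 2   # stereo pair -> ~39.7 MP

ideal_px = width**2 / math.pi * 2  # sphere area for circumference C is C²/π
waste = equirect_px / ideal_px - 1 # equals pi/2 - 1
print(round(equirect_px / 1e6, 1), round(ideal_px / 1e6, 1), round(waste * 100))
# 39.7 25.3 57
```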
26. Why use equirectangular?
Pros
Plenty of software out there to generate it
Fairly simple to render
Creates one continuous rectangular array
Simple for highly optimised video codecs
Cons
requires 60% extra pixels to achieve equivalent quality
Big distortion – straight lines become curves
Video codecs optimised for straight lines
Rendering artefacts caused by non-linearity
27. Are there alternatives?
Cube-maps?
+ Minimal distortion – straight lines stay straight
+ Hardware accelerated rendering
- nearly 2x pixels of ideal minimum
Pyramids?
Facebook have blogged about pyramids
Cube-maps in disguise
5 planar projections instead of 6
Compress more efficiently
Problem is as old as astronomy
28. Optimise Equirectangular?
Too much horizontal resolution at poles
Resolution is about 2x above 60 degrees
Chop the top and bottom off and halve their width
29. Optimise Equirectangular
Halve width of polar regions
Removes 30/180 of image => 5/6 * 40 = ~34 Megapixels
Now we’re only 35% worse than ideal
General lesson
We can divide sphere into regions
Change projection and resolution
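The saving from halving the polar bands tallies the same way. The figures follow the slides (a ~40 MP stereo pair, a ~24.5 MP ideal minimum); the slide rounds the result to ~34 MP and ~35% overhead.

```python
# Megapixels after halving the width of the bands above 60° latitude.
full_mp = 40.0                 # stereo equirect pair at ~6.3k x 3.2k
polar_fraction = 60 / 180      # top 30° + bottom 30° of latitude
saved = polar_fraction / 2     # those bands keep half their width
optimised_mp = full_mp * (1 - saved)   # 5/6 of 40

ideal_mp = 24.5                # sphere-surface minimum from earlier
overhead = optimised_mp / ideal_mp - 1
print(round(optimised_mp, 1), round(overhead * 100))  # 33.3 36
```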
30. Can we do better?
Divide into multiple regions
Remove the downward view
Vary resolution
Base on projection
And area of interest
31. Other Options
Lots of redundancy between left / right eye
Stereo aware compression as in 3D movies
Reduction can be as much as 60%
Viewer often cares about one direction much more than another
Broadcast of this event, screens and speaker more important
Give them more bandwidth
Reduce resolution of off directions or reduce codec quality
Send area around direction user is looking
Minimise switching latency
Better codecs
H.265/HEVC – 50% if you’re lucky
32. The Future
This Year
1k x 1k per eye
3 years
2k x 2k per eye (4k screens here now)
5+ years
4k x 4k per eye (wider field of view?)
Human vision Target per eye
8k x 8k may be sufficient?
34. Stereo VR Videos
Effectively a video for each eye
Parallax comes from camera positioning
Packed vertically (left = top, right = bottom)
Much stronger sense of presence
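Vertical packing means each decoded frame simply splits in half by rows. A minimal illustration with a plain 2-D array (the 3840 x 2160 frame dimensions are illustrative):

```python
# A top/bottom packed stereo frame: the top rows are the left eye,
# the bottom rows the right eye.
packed_h, width = 2160, 3840
frame = [[0] * width for _ in range(packed_h)]  # one greyscale frame

half = packed_h // 2
left_eye = frame[:half]     # top half  -> left eye
right_eye = frame[half:]    # bottom half -> right eye
print(len(left_eye), len(right_eye), len(left_eye[0]))  # 1080 1080 3840
```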
37. Stitch and Project
Add a camera top and bottom
Stitch all the left eyes together
Stitch all the right eyes together
Stereo Vision
38. Truth about 3D VR Video
Creates a convincing sense of depth
Increases sense of presence
This is good. Yay!
39. Truth about 3D VR Video
Up and down are mono
Unavoidably – look up, turn 90°, look up again
Effective stereo separation varies with viewing angle
40. Truth about 3D VR Video
No toe in
Human eyes track together
Don’t look straight forward
This impacts all VR for now
41. Truth about 3D VR Video
Camera is fixed position
Don’t move your head
Camera pairs have fixed separation and orientation
Don’t roll/tip your head
42. Truth about 3D VR Video
Camera positions fixed
Position
Roll
IPD (based on view angle)
Perfect when eyes aligned with camera
Less perfect elsewhere
More cameras and clever processing can improve
Still limited by fixed view in each half of stereo video
43. What can be done?
Need more 3D information
Depth and Occlusion
Reconstruct view each frame
44. Reconstruction
With depth and occlusion (geometry)
Generate right eye from left
Correct stereo for up and roll
Reconstruct different positions and orientations
Some head movement
45. Practical?
Challenging computer vision problem
Probably not full-scene in real-time yet
Multiple inward facing cameras
Motion capture suites
Potentially laser scan fixed scene in advance
Capture foreground objects live
Examples from HoloLens, 8i and others
Specular lighting difficult to reconstruct
I’m Jules, from Focal Point VR. I seem to have the last slot before lunch. I can’t promise to be scintillating but shall try for pace.
One way of describing what VR is for is teleportation
To imaginary worlds
Or for video; it’s teleportation to somewhere real
For live video, it’s presence at an event or experience
Where time and place both matter
Presence matters
Presence is the convincing illusion of being there.
Difficult to define but people know it when they experience it.
Much research has been done on what is needed to create this illusion. This is a subset. VR today is often hitting the minimum, in the long term we’ll be aiming for human perception.
We may return to this later.
VR Video starts with capture.
There are a variety of cameras, from the handmade to the fully productised, and from cheap consumer to full professional.
None of them are perfect, but in the end you have video coming from multiple cameras.
Here’s some example frames from a 12 camera rig.
5 stereo pairs, one up and one down.
The goal is to combine them into a single image for each eye. In this case an equirectangular projection, like many maps of the world.
When we come to render the view in the headset, it's easy to project this onto the inside of a globe.
Stitching process reverses distortion from a wide angle or fish-eye lens.
It stitches the images together and applies projection.
For Live video we have to do this at least in real-time
And for stereo we do it once for left eye cameras and again for right eye cameras.
Then compress and upload to the internet.
Once in the cloud, the video is downloaded or streamed to a headset
A sphere is placed around each eye.
Each view direction inside the sphere can be projected onto a video frame.
Now we have a VR video.
For live, the whole pipeline has to happen in real-time and with minimal latency.
Things are changing fast, new solutions coming up all the time.
There are many consumer quality options, but not really any simple options to achieve really high quality.
[Soon, consumers will be able to broadcast with a cheap camera and youtube or facebook.]
Today videos normally max out at 4k x 2k
With a few exceptions they use equirectangular
And for stereo, two views are just squashed in vertically
If the goal is teleportation, to bring the visceral feeling of presence at a live event, then we have many challenges.
I’m going to try to talk a little about two.
First problem is simply the reality of 4k video and beyond
4k broadcast is pretty expensive. It can be 20x cost of equivalent standard definition video
Maybe we can accept lower quality, perhaps half bandwidth.
Ideally, we should have our video at headset resolution.
For Gear VR, that’s about 17.5 pixels per degree.
A video resolution of, not 4k x 2k, but about 6k by 3k per eye.
Most video players max out at 4K, much lower than target
Blurred images are not so noticeable at 10 meters
But...when it’s literally on your face it definitely impacts presence
So, technical challenge 1
We want native resolution, today 6k x 3k
But we want low bandwidth
Ideally much lower than 4k
How can this be achieved?
My first thought: what information can we throw away?
The equirectangular projection has 1:1 pixels at the equator, but stretches enormously as you approach the poles
As a projection equirectangular has plenty of advantages
But do the maths and we have 60 percent more pixels than we need
Unfortunately no simple projection can achieve this ideal from sphere surface to rectangle
Equirectangular is simple. Pretty easy to understand but it has some costs.
The distortion costs bandwidth. Both in the codec and in number of pixels we start with.
There are many map projections and many trade offs. Planar, cylindrical and azimuthal.
This is an old problem, and there’s been plenty of recent research
Equirectangular is great at the equator
How about we just try to fix it at the poles (up and down)
In this example, we just chop the bottom and top sixth off
And pack them side by side.
No loss of resolution, they were really stretched already.
This solution knocks a sixth off our area
Now we’re only 35% worse than the ideal
So there’s a lesson here. We can chop the sphere up into pieces and change the projection.
In this example keep native resolution for the centre of interest
But reduce it where projection is bad
Or where it’s less interesting, like the floor
This example reduces size to less than 40% of original
There’s plenty of redundancy here and many options to capitalise on it. Reducing bandwidth by a factor of 10 is probably achievable. Which would be a bandwidth of 5 mbps for native resolution on a Gear VR.
Today if you sit about 2 meters from a 55” True HD screen, then the pixels are at the limit of human vision.
Higher resolution is not a win.
To get the equivalent from VR it would require more than 8k per eye.
Reference keynote
On another challenge. 3D in VR videos
We achieve stereo by having a video for each eye.
The win is much greater sense of presence.
Our key goal
Stereo 360 camera seems pretty simple. Place cameras at the eye positions.
Have multiple pairs of cameras
Stitch them together
And you're done
And this works
Convincing sense of depth
And presence
But....there are a few problems
Up and down are mono. They have to be.
And actually stereo separation varies with viewing angle as you turn your head